Abstract

In this paper, a nonparametric estimator is proposed for the $L_1$-median of a multivariate
conditional distribution when the covariates take values in an infinite-dimensional space.
The multivariate setting is more appropriate for predicting the components of a vector of random
variables simultaneously rather than predicting each of them separately. Estimating the
conditional $L_1$-median function by means of the well-known Nadaraya-Watson estimator, we
establish the strong consistency of this estimator as well as its asymptotic normality. We also
present some simulations and show how to build conditional confidence ellipsoids for the
multivariate $L_1$-median regression in practice. A numerical study on real chemometric data is
carried out to compare the multivariate $L_1$-median regression with the vector of marginal median
regressions, both when the covariate X is a curve and when X is a random vector.

Keywords: almost sure convergence, confidence ellipsoid, functional data, kernel estimation, small
ball probability, multivariate conditional $L_1$-median, multivariate conditional distribution.

1 Introduction

In statistics, researchers are often interested in how a response variable Y depends on an
explanatory variable X. Describing the behaviour of Y given a new value of the explanatory
variable X is an important task in nonparametric statistics. For instance, the regression
function provides the mean value taken by Y given X = x. Other characteristics of the
conditional distribution, such as the conditional median, conditional quantiles or the conditional
mode, may be quite interesting in practice. Furthermore, it is widely acknowledged that quantiles
are more robust to outliers than the regression function.

Conditional quantiles are widely studied when the explanatory variable X lies within a finite 
dimensional space. There are many references on this topic (see Gannoun et al. (2003a)). 

During the last decade, thanks to progress in computing tools, there has been an increasing number
of examples coming from different fields of applied sciences for which the data are curves. For
instance, some random variables can be observed at several different times. Variables of this kind,
known in the literature as functional variables (of time, for instance), allow us to consider the
data as curves. The books by Bosq (2000) and Ramsay and Silverman (2005) propose an interesting
description of the available procedures dealing with functional observations, whereas Ferraty and
Vieu (2006) present a completely nonparametric point of view. These functional approaches mainly
rely on generalizing multivariate statistical procedures to functional spaces and have proved
useful in various areas such as chemometrics (Hastie and Mallows (1993) and Quintela-del Rio
and Francisco-Fernandez (2011)), economics (Kneip and Utikal (2001)), climatology (Besse et al.
(2000)), biology (Kirkpatrick and Heckman (1989)), geoscience (Quintela-del Rio and
Francisco-Fernandez (2011)) or hydrology (Chebana and Ouarda (2011)). These functional approaches
are generally more appropriate than longitudinal data models or time series analysis when there
are, for each curve, many measurement points (Rice (2004)).

In the univariate case (i.e. $Y \in \mathbb{R}$ and X is a functional covariate), among the many
papers dealing with the nonparametric estimation of conditional quantiles, one may cite Cardot
et al. (2005), who introduced univariate quantile regression with a functional covariate, and
Ferraty et al. (2005), who estimate the conditional quantile by inverting the conditional
cumulative distribution function. Ezzahrioui and Ould-Said (2008) establish the almost complete
convergence and the asymptotic normality in the setting of independent and identically distributed
(i.i.d.) data as well as under an $\alpha$-mixing condition. Dabo-Niang and Laksaci (2012) stated
the convergence in $L^p$-norm. In the same framework, Laksaci et al. (2009) estimated the
conditional quantile nonparametrically by adapting the $L^1$-norm method. Recently, Quintela-del
Rio and Vieu (2011) have used the approach proposed by Ferraty et al. (2005) to predict future
stratospheric ozone concentrations and to estimate return levels of extreme values of tropospheric
ozone.

Over the past decades, researchers have shown increasing interest in studying multivariate
location parameters such as multivariate quantiles, in order to find suitable analogs of the
univariate quantiles that are used to construct descriptive statistics and robust estimators of
location. In contrast to the univariate case, the order of observations $Y_i$ lying in
$\mathbb{R}^d$ (with $d \geq 2$) is not total. Consequently, several quantile-type multivariate
definitions have been formulated. The pioneering paper of Haldane (1948) considered a multivariate
extension of the median defined as an M-estimator (also called the spatial or $L_1$-median). The
reader is referred to Serfling (2002) for historical reviews and comparisons. Chaudhuri (1996) and
Koltchinskii (1997) defined the geometric quantile as an extension of multivariate quantiles based
on norm minimization and on the geometry of multivariate data clouds.

In contrast, relatively little attention has been paid to multivariate conditional quantiles
($Y \in \mathbb{R}^d$ and $X \in \mathbb{R}^s$) and their large sample properties. Cadre (2001)
defined the conditional $L_1$-median and proved its uniform consistency on compact subsets of
$\mathbb{R}^s$. Recently, De Gooijer et al. (2006) have introduced a notion of multivariate
conditional quantile, which extends the definition of unconditional quantiles by Abdous and
Theodorescu (1992), to predict tails from bivariate time series. Cheng and De Gooijer (2007) have
generalized the notion of geometric quantiles, defined by Chaudhuri (1996), to the conditional
setting. They have established a Bahadur-type linear representation of the u-th geometric
conditional quantile estimator as well as its asymptotic normality in the i.i.d. case.

The purpose of this paper is to add some new results to the nonparametric estimation of the
conditional $L_1$-median when Y is a random vector with values in $\mathbb{R}^d$ while the
covariate X takes its values in some infinite-dimensional space $\mathcal{F}$. As far as we know,
this problem has not been studied in the literature before and the results obtained here are
believed to be novel. Moreover, our motivation for studying this type of robust estimator is its
interest in some practical applications. Note also that it is better to predict all components of
a vector of random variables simultaneously, in order to take into account the correlation between
them, rather than predicting each component separately. For instance, at EDF (the French
electricity company), the estimation of the minimum and the maximum of the electricity power
demand represents an important research issue for both economic and security reasons: an
underestimation of the maximum consumed quantity of electricity (especially in winter) may require
importing electricity from other European countries at high prices, while an overestimation of
this maximum quantity may have a negative effect on the electricity distribution network. The
estimation of the minimum power demand is also an important task for the same reasons. Notice that
the minimum and the maximum of the electricity power demand are strongly correlated. Thus, it is
more appropriate to predict these variables simultaneously rather than predicting each of them
separately. On the other hand, weather variables, like temperature curves, can play a key role in
explaining the minimum and the maximum of power demand. Owing to its robustness properties, the
conditional $L_1$-median may be used to solve this prediction problem using a temperature curve as
covariate.

The paper is organized as follows. Section 2 outlines the notations and the form of the new
estimator. Section 3 presents the main results concerning the asymptotic behavior of the estimator,
including consistency, asymptotic normality and evaluation of the bias term. An estimate of the
conditional confidence region is then deduced. Section 4 is devoted to a simulation study giving
an example of the estimated confidence region. An application to real chemometric data is proposed
in Section 5, where we compare three approaches for predicting a random vector: $L_1$-median
regression, the vector of marginal conditional medians and the non-functional multivariate median.
The proofs of the results in Section 3 are relegated to the Appendix.

2 Notations and definitions 

Let us consider a random pair $(X, Y)$ where X and Y are two random variables defined on the same
probability space $(\Omega, \mathcal{A}, \mathbb{P})$. We suppose that Y is $\mathbb{R}^d$-valued
and X is a functional random variable (f.r.v.) taking its values in some infinite-dimensional
vector space $(\mathcal{F}, d(\cdot,\cdot))$ equipped with a semi-metric $d(\cdot,\cdot)$. Let x be
a fixed point in $\mathcal{F}$ and $F(\cdot \mid x)$ be the conditional cumulative distribution
function (cond. c.d.f.) of Y given X = x. The conditional $L_1$-median,
$\mu : \mathcal{F} \to \mathbb{R}^d$, of Y given X = x, is defined as

$$\mu(x) = \arg\min_{u \in \mathbb{R}^d} \mathbb{E}\left[(\|Y - u\| - \|Y\|) \mid X = x\right] = \arg\min_{u \in \mathbb{R}^d} \int (\|y - u\| - \|y\|)\, dF(y \mid x). \tag{1}$$

The general definition (1) does not assume the existence of the first-order moment of $\|Y\|$.
However, when Y has a finite expectation, $\mu(x)$ becomes a minimizer over u of
$\mathbb{E}[\|Y - u\| \mid X = x]$. Notice that the existence and the uniqueness of $\mu(x)$ are
guaranteed, for $d \geq 2$, provided that the conditional distribution function $F(\cdot \mid x)$
is not supported on a single straight line (see Theorem 2.17 of Kemperman (1987)). Hence,
uniqueness holds whenever Y has an absolutely continuous conditional distribution on
$\mathbb{R}^d$ with $d \geq 2$.

Without loss of generality, we suppose in the sequel that $\mathbb{E}\|Y\| < \infty$. Therefore,
for any fixed $x \in \mathcal{F}$, the conditional $L_1$-median $\mu(x)$ may be viewed as a
minimizer of the function $G^x : \mathbb{R}^d \to \mathbb{R}$ defined, for all
$u \in \mathbb{R}^d$, by

$$G^x(u) := \mathbb{E}\left[\|Y - u\| \mid X = x\right], \tag{2}$$

which is assumed to be differentiable and uniformly bounded with respect to u.

We now introduce some further definitions and notations. Denote by $A^t$ the transpose of the
matrix A, and let $\|A\| = \sqrt{\mathrm{tr}(A^t A)}$ be the trace norm. Notice that, for any
$y \in \mathbb{R}^d$, the function $y \mapsto \|y\|$ is differentiable everywhere except at
$y = 0_{\mathbb{R}^d}$; one may then define (by continuity extension) its derivative as
$\mathcal{U}(y) = y/\|y\|$ when $y \neq 0$ and $\mathcal{U}(y) = 0$ whenever $y = 0$. For any
$y \neq u$, define

$$\mathcal{M}(y, u) = \frac{1}{\|y - u\|}\left(I_d - \mathcal{U}(y - u)\,\mathcal{U}^t(y - u)\right),$$

where $I_d$ is the $d \times d$ identity matrix. We denote by $\nabla_u G^x(u)$ the gradient of the
function $G^x(u)$ and by $H^x(u)$ its Hessian matrix (with respect to u). According to
Koltchinskii (1997), it is easy to see that

$$\nabla_u G^x(u) = -\mathbb{E}\left[\mathcal{U}(Y - u) \mid X = x\right] \tag{3}$$

and

$$H^x(u) = \mathbb{E}\left[\mathcal{M}(Y, u) \mid X = x\right]. \tag{4}$$
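As an illustration of (3) and (4), the following minimal Python sketch implements $\mathcal{U}$, $\mathcal{M}$ and their weighted sample analogues. The function names and the use of generic weights in place of the conditional expectation are assumptions made for this example only; they are not part of the original procedure.

```python
import numpy as np


def unit_direction(y):
    """U(y) = y / ||y|| for y != 0, extended by U(0) = 0."""
    y = np.asarray(y, dtype=float)
    norm = np.linalg.norm(y)
    return y / norm if norm > 0 else np.zeros_like(y)


def m_matrix(y, u):
    """M(y, u) = (I_d - U(y - u) U(y - u)^t) / ||y - u||, for y != u."""
    diff = np.asarray(y, dtype=float) - np.asarray(u, dtype=float)
    norm = np.linalg.norm(diff)
    if norm == 0:
        raise ValueError("M(y, u) is not defined when y == u")
    direction = diff / norm
    return (np.eye(diff.size) - np.outer(direction, direction)) / norm


def weighted_gradient(Y, u, weights):
    """Sample analogue of (3): -sum_i w_i U(Y_i - u) (weights are hypothetical)."""
    return -sum(w * unit_direction(y - u) for y, w in zip(Y, weights))


def weighted_hessian(Y, u, weights):
    """Sample analogue of (4): sum_i w_i M(Y_i, u) (weights are hypothetical)."""
    return sum(w * m_matrix(y, u) for y, w in zip(Y, weights))
```

A finite-difference comparison with the objective $u \mapsto \sum_i w_i \|Y_i - u\|$ is a simple way to check the sign conventions of (3) and (4).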



Notice that $H^x(u)$ is bounded whenever $\mathbb{E}[\|Y - u\|^{-1} \mid X = x] < \infty$.
According to (1) and (3), the conditional $L_1$-median may then be implicitly defined as a zero,
with respect to u, of the following equation:

$$\nabla_u G^x(u) = 0. \tag{5}$$

To build our estimator, let $(X_i, Y_i)_{i=1,\dots,n}$ be a statistical sample of pairs which are
independent and identically distributed as $(X, Y)$. Let us denote by

$$w_{n,i}(x) = \frac{\Delta_i(x)}{\sum_{i=1}^n \Delta_i(x)}$$

the so-called Nadaraya-Watson weights, where $\Delta_i(x) = K(d(x, X_i)/h)$, with K a kernel
function and $h := h_n$ a sequence of positive real numbers which decreases to zero as n tends to
infinity. A kernel estimator of the function $G^x(u)$ is given by

$$G_n^x(u) = \sum_{i=1}^n w_{n,i}(x)\,\|Y_i - u\| = \frac{G_{n,2}^x(u)}{G_{n,1}^x}, \tag{6}$$

when the denominator is not equal to 0, where

$$G_{n,j}^x(u) = \frac{1}{n\,\mathbb{E}(\Delta_1(x))}\sum_{i=1}^n \|Y_i - u\|^{j-1}\Delta_i(x), \quad \text{for } j = 1, 2, \text{ with } G_{n,1}^x(u) := G_{n,1}^x. \tag{7}$$

A kernel estimate of $\nabla_u G^x(u)$ may be defined by

$$\nabla_u G_n^x(u) := -\sum_{i=1}^n w_{n,i}(x)\,\mathcal{U}(Y_i - u), \quad u \in \mathbb{R}^d. \tag{8}$$

According to the statement (2), the estimator of the conditional $L_1$-median, $\mu_n(x)$, may be
viewed as a minimizer over u of the function $G_n^x(u)$, that is

$$\mu_n(x) = \arg\min_{u \in \mathbb{R}^d} G_n^x(u), \tag{9}$$

or as a zero with respect to u of the equation $\nabla_u G_n^x(u) = 0$.
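The minimization in (9) can be sketched numerically as follows. This is only an illustrative sketch: it assumes a quadratic kernel on [0, 1] and uses a plain Weiszfeld-type fixed-point iteration, neither of which is prescribed at this point of the paper; the function names are hypothetical, and the distances $d(x, X_i)$ are assumed to be precomputed.

```python
import numpy as np


def quadratic_kernel(t):
    """Quadratic kernel on [0, 1] (an illustrative choice of K)."""
    t = np.asarray(t, dtype=float)
    return np.where(t <= 1.0, 1.0 - t**2, 0.0)


def nw_weights(distances, h, kernel=quadratic_kernel):
    """Nadaraya-Watson weights w_{n,i}(x) = K(d(x, X_i)/h) / sum_j K(d(x, X_j)/h)."""
    k = kernel(np.asarray(distances, dtype=float) / h)
    total = k.sum()
    if total <= 0:
        raise ValueError("no curve falls within bandwidth h of x")
    return k / total


def conditional_l1_median(Y, weights, tol=1e-10, max_iter=2000):
    """Approximate argmin_u sum_i w_i ||Y_i - u|| by a Weiszfeld-type iteration."""
    Y = np.asarray(Y, dtype=float)
    w = np.asarray(weights, dtype=float)
    u = w @ Y / w.sum()  # weighted mean as starting point
    for _ in range(max_iter):
        # Guard against division by zero when u hits a data point.
        norms = np.maximum(np.linalg.norm(Y - u, axis=1), 1e-12)
        coef = w / norms
        u_next = coef @ Y / coef.sum()
        if np.linalg.norm(u_next - u) < tol:
            return u_next
        u = u_next
    return u
```

For a target curve x, one would compute the vector of distances $d(x, X_i)$, call `nw_weights` with a bandwidth h, and pass the resulting weights together with the responses $Y_i$ to `conditional_l1_median`.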

Similarly to Fact 2.1.1 in Chaudhuri (1996) and Remark 2.3 in Cheng and De Gooijer (2007), the
existence of the estimator $\mu_n(x)$ is guaranteed by the fact that the function
$u \mapsto \sum_{i=1}^n w_{n,i}(x)\|Y_i - u\|$ tends to infinity as $\|u\| \to \infty$. Since this
function is also continuous with respect to u, it attains its minimum, so that a minimizer
$\mu_n(x)$ of $\sum_{i=1}^n w_{n,i}(x)\|Y_i - u\|$ exists. Next comes the question of uniqueness.
Since $\mathbb{R}^d$ equipped with the Euclidean norm is a strictly convex Banach space for
$d \geq 2$, it follows from Theorem 2.17 of Kemperman (1987) that, unless all the data points
$Y_1, \dots, Y_n$ fall on a straight line in $\mathbb{R}^d$,
$\sum_{i=1}^n w_{n,i}(x)\|Y_i - u\|$ must be a strictly convex function of u. This guarantees the
uniqueness of the minimizer $\mu_n(x)$ in $\mathbb{R}^d$, for any $d \geq 2$.



3 Main Results 

3.1 Further notations and hypotheses 

Let x be a given point in $\mathcal{F}$ and $\mathcal{V}_x$ a neighbourhood of x. Denote by
$B(x, h)$ the ball of center x and radius h, namely
$B(x, h) = \{x' \in \mathcal{F} : d(x, x') \leq h\}$. For $(\ell, u) \in \mathbb{R} \times \mathbb{R}^d$,
denote by $G^{x'}_\ell(u) = \mathbb{E}[\|Y - u\|^\ell \mid X = x']$, for $x' \in \mathcal{F}$. Our
hypotheses are gathered here for easy reference.

(H1) K is a nonnegative bounded kernel of class $C^1$ over its support [0, 1] such that
$K(1) > 0$. The derivative $K'$ exists on [0, 1] and satisfies the condition $K'(t) < 0$ for all
$t \in [0, 1]$, and $\left|\int_0^1 (K^j)'(t)\,dt\right| < \infty$ for j = 1, 2.

(H2) For $x \in \mathcal{F}$, there exist a deterministic nonnegative bounded function g and a
nonnegative real function $\phi$ tending to zero, as its argument tends to 0, such that

(i) $F_x(h) := \mathbb{P}(X \in B(x, h)) = \phi(h)\, g(x) + o(\phi(h))$ as $h \to 0$.
(ii) There exists a nondecreasing bounded function $\tau_0$ such that, uniformly in $s \in [0, 1]$,
$\frac{\phi(hs)}{\phi(h)} = \tau_0(s) + o(1)$ as $h \downarrow 0$ and, for $j \geq 1$,
$\int_0^1 (K^j(t))'\,\tau_0(t)\,dt < \infty$.

(H3) (i) For $x \in \mathcal{F}$, $|G^x(u) - G^{x'}(u)| \leq c_1 d^\beta(x, x')$ uniformly in u, for
some $\beta > 0$ and a constant $c_1 > 0$, whenever $x' \in \mathcal{V}_x$.
(ii) For $x' \in \mathcal{F}$, the Hessian matrix $H^{x'}(u)$ is continuous in $\mathcal{V}_x$:

$$\sup_{x' \in B(x,h)} \|H^{x'}(u) - H^x(u)\| = o(1).$$
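(iii) For some integer $m \geq 2$, $G^x_m(\mu(x)) < \infty$ and $G^{x'}_m(\mu(x))$ is continuous in
$\mathcal{V}_x$.

(iv) For some integer $m \geq 1$ and any $(k, j)$, $1 \leq k \leq d$, $1 \leq j \leq d$,
$\mathbb{E}\left(\mathcal{M}^m_{k,j}(Y, \mu) \mid X\right) < \infty$ and

$$\sup_{\{x' : d(x,x') \leq h\}} \left|\mathbb{E}\left(\mathcal{M}^m_{k,j}(Y, \mu) \mid x'\right) - \mathbb{E}\left(\mathcal{M}^m_{k,j}(Y, \mu) \mid x\right)\right| = o(1).$$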

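(H4) (i) For each $x' \in \mathcal{F}$, $\sup_u G^{x'}_2(u) < \infty$ and $G^{x'}_2(u)$ is
continuous in $\mathcal{V}_x$ uniformly in u:

$$\sup_{u \in \mathbb{R}^d}\ \sup_{\{x' : d(x,x') \leq h\}} |G^{x'}_2(u) - G^x_2(u)| = o(1).$$

(ii) For some $\delta > 0$ and $\ell \in \mathbb{R}^d$, the real function
$W^{x'}_{i+j\delta}(\mu) := \mathbb{E}\left[|\ell^t\, \mathcal{U}(Y - \mu)|^{i+j\delta} \mid X = x'\right]$
(i = 1, 2 and j = 0, 1) is continuous in $\mathcal{V}_x$.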

(H5) For any $i \geq 1$, $\mathbb{E}\left[\mathcal{U}(Y - \mu) \mid d(x, X) = v\right] =: \psi(v)$,
where $v \in \mathbb{R}$ and $\psi : \mathbb{R} \to \mathbb{R}^d$ is a differentiable function such
that $\psi'(0) \neq 0$.

Remark 3.1 Notice that, since $d(\cdot, \cdot)$ is a semi-metric, we have
$\psi(0) = \mathbb{E}\left[\mathcal{U}(Y - \mu) \mid X = x\right]$. As a consequence, it follows from
the definition of $\mu$ that $\psi(0) = 0$.
Comments on the Hypotheses

The above conditions are fairly mild. Condition (H1) is standard in the context of functional
nonparametric estimation. In contrast to the real and vectorial cases (for which one generally
supposes the strict positivity of the explanatory variable's density), the concentration hypothesis
(H2)(i) acts directly on the distribution of the functional random variable rather than on its
density function. The idea of writing the small ball probability $F_x(h)$ as a product of two
independent functions $g(x)$ and $\phi(h)$ was adopted by Masry (2005), who reformulated the
condition of Gasser et al. (1998). This assumption has been used by many authors, where $g(x)$ is
interpreted as a probability density, while $\phi(h)$ may be interpreted as a volume parameter. In
the finite-dimensional case, that is $\mathcal{F} = \mathbb{R}^d$, it can be seen that
$F_x(h) = C(d)\,h^d g(x) + o(h^d)$, where $C(d)$ is the volume of the unit ball in $\mathbb{R}^d$.
Furthermore, in infinite dimensions, there exist many examples fulfilling the decomposition
mentioned in assumption (H2)(i) (see Ferraty et al. (2007) and Ezzahrioui and Ould-Said (2008) for
more details). The function $\tau_0(\cdot)$, introduced in assumption (H2)(ii), plays a determinant
role in the asymptotic properties, in particular when we give the order of the conditional bias and
the asymptotic variance term.

Conditions (H3) and (H4) are mild smoothness assumptions on the functionals $G^{x'}(u)$ and
$H^{x'}(u)$ and continuity assumptions on certain second-order moments. An assumption similar to
(H3)(iii) has been made in Cheng and De Gooijer (2007) (see condition 6 in their paper).
Condition (H5) is used to evaluate the bias term.

3.2 Almost sure consistency 

The following result states the almost sure (a.s.) convergence (with rate) of the functional
estimator $G_n^x(u)$. This result plays an instrumental role in proving the almost sure consistency
of $\mu_n(x)$ for a fixed $x \in \mathcal{F}$.

Proposition 3.1 Assume that conditions (H1)-(H2), (H3)(i) and (H4)(i) hold true and that

$$\text{(i)}\ \frac{\log n}{n\phi(h)} \to 0 \quad \text{and} \quad \text{(ii)}\ \frac{n\phi(h)\,h^{2\beta}}{\log n} \to 0 \quad \text{as } n \to \infty, \text{ where } \beta \text{ is given in (H3)}, \tag{10}$$

$$\lim_{\|u\| \to \infty} \|u\|\, G^x(u) < \infty. \tag{11}$$

Then, we have

$$\sup_{u \in \mathbb{R}^d} |G_n^x(u) - G^x(u)| = O_{a.s.}(h^\beta) + O_{a.s.}\left(\sqrt{\frac{\log n}{n\phi(h)}}\right).$$

Notice that condition (11) is standard when one deals with the uniform consistency of the
density function on the whole space (see, for instance, Corollary 2.2 of Bosq (1996)).
We now give our first result on the conditional $L_1$-median estimator $\mu_n(x)$.



Theorem 3.2 Assume that (H1)-(H2), (H3)(i), (H4)(i) and condition (10) hold true. Then, we
have

$$\lim_{n \to \infty} \mu_n(x) = \mu(x) \quad \text{a.s.} \tag{12}$$

3.3 Asymptotic normality 

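To state the asymptotic normality of our estimator, some notations are required. Recall from (7)
that

$$G_{n,2}^x(u) = \frac{\sum_{i=1}^n \|Y_i - u\|\,\Delta_i(x)}{n\,\mathbb{E}(\Delta_1(x))} \quad \text{and denote its gradient by} \quad \nabla_u G_{n,2}^x(u) = -\frac{\sum_{i=1}^n \mathcal{U}(Y_i - u)\,\Delta_i(x)}{n\,\mathbb{E}(\Delta_1(x))}.$$

Set $\mu(x) =: \mu = (\mu_1, \dots, \mu_d)^t$ and $\mu_n(x) =: \mu_n = (\mu_{n,1}, \dots, \mu_{n,d})^t$.
We have by the definition of $\mu_n$ that

$$\nabla_u G_n^x(\mu_n) = -\frac{\sum_{i=1}^n \mathcal{U}(Y_i - \mu_n)\,\Delta_i(x)}{\sum_{i=1}^n \Delta_i(x)} = 0. \tag{13}$$

Obviously, equation (13) is satisfied when the numerator is null. Then, we can also say that

$$\nabla_u G_{n,2}^x(\mu_n) = -\frac{\sum_{i=1}^n \mathcal{U}(Y_i - \mu_n)\,\Delta_i(x)}{n\,\mathbb{E}(\Delta_1(x))} = 0. \tag{14}$$

Thereafter, one may write

$$\nabla_u G_{n,2}^x(\mu_n) - \nabla_u G_{n,2}^x(\mu) = -\nabla_u G_{n,2}^x(\mu). \tag{15}$$

For each $j \in \{1, \dots, d\}$, Taylor's expansion applied to the real-valued function
$\partial G_{n,2}^x / \partial u_j$ implies the existence of
$\xi_n(j) = (\xi_{n,1}(j), \dots, \xi_{n,d}(j))^t$ such that

$$\frac{\partial G_{n,2}^x}{\partial u_j}(\mu_n) - \frac{\partial G_{n,2}^x}{\partial u_j}(\mu) = \sum_{k=1}^d \frac{\partial^2 G_{n,2}^x}{\partial u_j\,\partial u_k}(\xi_n(j))\,(\mu_{n,k} - \mu_k) \quad \text{and} \quad |\xi_{n,k}(j) - \mu_k| \leq |\mu_{n,k} - \mu_k|.$$

Define the $d \times d$ matrix $H_n^x(\xi_n(j)) = \left(H^x_{n,k,j}(\xi_n(j))\right)_{1 \leq k,j \leq d}$ by
setting $H^x_{n,k,j}(\xi_n(j)) = \frac{\partial^2 G_{n,2}^x}{\partial u_j\,\partial u_k}(\xi_n(j))$,
where, for all $u \in \mathbb{R}^d$ and $x \in \mathcal{F}$,

$$H^x_{n,k,j}(u) = \sum_{i=1}^n \frac{1}{\|Y_i - u\|}\left(\delta_{k,j} - \frac{(Y_i^k - u^k)(Y_i^j - u^j)}{\|Y_i - u\|^2}\right)\frac{\Delta_i(x)}{n\,\mathbb{E}(\Delta_1(x))} = \frac{\sum_{i=1}^n \mathcal{M}_{k,j}(Y_i, u)\,\Delta_i(x)}{n\,\mathbb{E}(\Delta_1(x))},$$

with $\delta_{k,j} = 1$ if $k = j$ and zero otherwise, and where

$$\mathcal{M}_{k,j}(Y_i, u) = \frac{1}{\|Y_i - u\|}\left(\delta_{k,j} - \frac{(Y_i^k - u^k)(Y_i^j - u^j)}{\|Y_i - u\|^2}\right)$$

is the $(k, j)$-th element of the matrix $\mathcal{M}(Y_i, u)$. Equation (15) can then be rewritten as

$$H_n^x(\xi_n(j))\,(\mu_n - \mu) = -\nabla_u G_{n,2}^x(\mu). \tag{16}$$

Equation (16) plays a key role in deriving the conditional bias and the asymptotic distribution of
the conditional $L_1$-median estimator $\mu_n$.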

Proposition 3.2 Under assumptions (H1)-(H3) and (H4)(i) and condition (10)(i), we have

$$\|H_n^x(\xi_n(j)) - H^x(\mu)\| = o_P(1), \quad \text{as } n \to \infty.$$

Using Remark 4 and Lemma 5.3 of Chaudhuri (1992), we know that both the matrix $H^x(\mu)$ itself
and its inverse matrix exist whenever $d \geq 2$. It follows from this result combined with (16)
that, for n large enough, $\mu_n - \mu = -[H^x(\mu)]^{-1}\nabla_u G_{n,2}^x(\mu) + o_P(1)$. One may
then write, for large n, that

$$\sqrt{n\phi(h)}\,(\mu_n - \mu) = \sqrt{n\phi(h)}\,[H^x(\mu)]^{-1}\left[-\left(\nabla_u G_{n,2}^x(\mu) - \mathbb{E}\left[\nabla_u G_{n,2}^x(\mu)\right]\right) - B_n(x)\right] + o_P(1), \tag{17}$$

where $B_n(x) = \mathbb{E}\left[\nabla_u G_{n,2}^x(\mu)\right]$, with $G_{n,2}^x$ as defined in (7).

The following proposition gives the order of the conditional bias term
$\mathcal{B}_n(x) = -[H^x(\mu)]^{-1} B_n(x)$.

Proposition 3.3 Under assumptions (H1), (H2) and (H5), and the fact that
$g(x) > 0$ and $\left|\int_0^1 (sK(s))'\,\tau_0(s)\,ds\right| < \infty$, we have:

$$\mathcal{B}_n(x) = \frac{h\,[H^x(\mu)]^{-1}\psi'(0)}{M_1}\left(\int_0^1 (sK(s))'\,\tau_0(s)\,ds - K(1)\right) + o_{a.s.}(h),$$

where, for j = 1, 2, $M_j = K^j(1) - \int_0^1 (K^j)'(z)\,\tau_0(z)\,dz$.

The theorem below gives the asymptotic normality of our estimator.

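Theorem 3.3 Suppose assumptions (H1)-(H5) and condition (10)(i) hold.
If $(n\phi(h))^{\delta/2} \to \infty$ for some $\delta > 0$, then:

(i) $\sqrt{n\phi(h)}\,\left(\mu_n(x) - \mu(x) - \mathcal{B}_n(x)\right) \xrightarrow{\mathcal{D}} \mathcal{N}_d\left(0, \Gamma^x(\mu)\right)$,

where

$$\Gamma^x(\mu) = \frac{M_2}{M_1^2\, g(x)}\,[H^x(\mu)]^{-1}\,\Sigma^x(\mu)\,[H^x(\mu)]^{-1}$$

and

$$\Sigma^x(\mu) = \mathbb{E}\left[\mathcal{U}(Y - \mu)\,\mathcal{U}^t(Y - \mu) \mid X = x\right].$$

(ii) If in addition we impose the following stronger condition on the bandwidth $h_n$:

$$\sqrt{n\phi(h)}\,h \to 0 \quad \text{as } n \to \infty,$$

one gets

$$\sqrt{n\phi(h)}\,\left(\mu_n(x) - \mu(x)\right) \xrightarrow{\mathcal{D}} \mathcal{N}_d\left(0, \Gamma^x(\mu)\right).$$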

Remark 3.4 (i) Notice that the constants $M_1$ and $M_2$ are strictly positive. Indeed, making use
of condition (H1) and the fact that the function $\tau_0(\cdot)$ is nondecreasing, it suffices to
perform a simple integration by parts. Also, since the conditional distribution of Y given X = x
is absolutely continuous, $\Sigma^x(\mu)$ is a positive definite matrix.

(ii) Whenever $\mathcal{F} = \mathbb{R}^s$, $s \geq 1$, and if the probability density of the random
variable X, say $g_s(\cdot)$, is of class $C^1$, then $\phi(h) = V(s)h^s$, where $V(s)$ is the
volume of the unit ball of $\mathbb{R}^s$. In this case $\tau_0(u) = u^s$, so that
$M_j = s\int_0^1 K^j(u)\,u^{s-1}du$, and the asymptotic variance expression takes the form

$$\Gamma^x(\mu) = \frac{\int_0^1 K^2(u)\,u^{s-1}du}{s\, g_s(x)\left[\int_0^1 K(u)\,u^{s-1}du\right]^2}\,[H^x(\mu)]^{-1}\,\Sigma^x(\mu)\,[H^x(\mu)]^{-1}.$$

In such a case the central limit theorem has the form given in the above theorem with convergence
rate $(nh^s)^{1/2}$. Notice that in the infinite-dimensional case, the function $\phi(h)$ could
decrease to zero exponentially fast as $h \to 0$, and the convergence rate becomes effectively
$(n\phi(h))^{1/2}$. This fact may be used to address the curse of dimensionality (see Masry (2005)
for details). As an
example, consider, in an infinite-dimensional setting, the random process defined by

$$X_t = \theta t + W_t, \quad 0 \leq t \leq 1,$$

where $\theta$ is a $\mathcal{N}(0, 1)$ random variable independent of the Wiener process
$W = \{W_t : 0 \leq t \leq 1\}$. It is well known (see Liptser and Shiryayev (1972)) that the
distribution $\nu_X$ of X is absolutely continuous with respect to the Wiener measure $\nu_W$, and
admits a Radon-Nikodym density $f(x)$. In this case, hypothesis (H2)(i) is satisfied with
$\phi(h) = \frac{4}{\pi}\exp\left(-\frac{\pi^2}{8h^2}\right)$ (see Laib and Louani (2011) for
details). The convergence rate in Theorem 3.3 is then $O\left(n^{\frac{1-\alpha}{2}}\right)$ (with
$0 < \alpha < 1/2$) by taking

$$h_n := \frac{\pi}{2\sqrt{2\log n^{\alpha}}}.$$

Observe now in Theorem 3.3 that the limiting variance contains the unknown function $g(x)$, and
that the normalization depends on the function $\phi$, which is not explicitly identifiable. To
make this result operational in practice, we have to estimate the quantities $\Sigma$, H and
$\tau_0$. For this purpose, we estimate the conditional variance matrix $\Sigma^x(\mu)$ of
$\nabla_u G^x(\mu)$ by

$$\Sigma_n^x(\mu_n) = \sum_{i=1}^n w_{n,i}(x)\,\mathcal{U}(Y_i - \mu_n)\,\mathcal{U}^t(Y_i - \mu_n)$$

and the matrix $H^x(\mu)$ by

$$H_n^x(\mu_n) = \sum_{i=1}^n w_{n,i}(x)\,\mathcal{M}(Y_i, \mu_n).$$

Making use of the decomposition of $F_x(u)$ in (H2)(i), one may estimate $\tau_0(u)$ by

$$\tau_n(u) = \frac{F_{x,n}(uh)}{F_{x,n}(h)}, \quad \text{where} \quad F_{x,n}(u) = \frac{1}{n}\sum_{i=1}^n \mathbb{1}_{\{d(x, X_i) \leq u\}}.$$

Subsequently, for a given kernel K, the quantities $M_1$ and $M_2$ are estimated by $M_{1,n}$ and
$M_{2,n}$ respectively, replacing $\tau_0$ by $\tau_n$ in their respective expressions.

Corollary 3.5 below, which is a slight modification of Theorem 3.3, gives a form of our results
that is useful in practice.

Corollary 3.5 Assume that the conditions of Theorem 3.3 hold true and that $K'$ and $(K^2)'$ are
integrable functions. If in addition we suppose that

$$nF_x(h) \to \infty \quad \text{and} \quad h^\beta\,(nF_x(h))^{1/2} \to 0, \quad \text{as } n \to \infty,$$

where $\beta$ is specified in condition (H3), then, for any $x \in \mathcal{F}$ such that
$g(x) > 0$, we have

$$\frac{M_{1,n}}{\sqrt{M_{2,n}}}\,\sqrt{nF_{x,n}(h)}\,\left[\Sigma_n^x(\mu_n)\right]^{-1/2} H_n^x(\mu_n)\,\left(\mu_n(x) - \mu(x)\right) \xrightarrow{\mathcal{D}} \mathcal{N}_d(0, I_d).$$

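3.4 Building conditional confidence region of $\mu(x)$
From Corollary 3.5, we can easily see that

$$\left(\mu_n(x) - \mu(x)\right)^t\left[\Lambda_n^x(\mu_n)\right]^{-1}\left(\mu_n(x) - \mu(x)\right) \xrightarrow{\mathcal{D}} \chi^2_d,$$

where

$$\Lambda_n^x(\mu_n) = \frac{M_{2,n}}{M_{1,n}^2\; nF_{x,n}(h)}\,\left[H_n^x(\mu_n)\right]^{-1}\Sigma_n^x(\mu_n)\left[H_n^x(\mu_n)\right]^{-1}.$$

Then, the asymptotic $100(1 - \alpha)\%$ ($\alpha \in (0, 1)$) conditional confidence region for
$\mu(x)$ is given by

$$\left(\mu_n(x) - \mu(x)\right)^t\left[\Lambda_n^x(\mu_n)\right]^{-1}\left(\mu_n(x) - \mu(x)\right) \leq \chi^2_d(\alpha), \tag{18}$$

where $\chi^2_d(\alpha)$ denotes the $100(1 - \alpha)$-th percentile of a chi-squared distribution
with d degrees of freedom.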
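In the bivariate case (d = 2), the region (18) is an ellipse, and membership can be checked with a few lines of Python. The sketch below is illustrative only: `mu_hat` and `cov_hat` are hypothetical inputs standing for $\mu_n(x)$ and for an already computed estimate of the covariance matrix of $\mu_n(x)$ (including the sample-size normalization), and the closed form $-2\log\alpha$ for the chi-squared quantile is valid for 2 degrees of freedom only.

```python
import math

import numpy as np


def chi2_quantile_2df(alpha):
    """100(1 - alpha)-th percentile of a chi-squared law with 2 degrees of freedom."""
    return -2.0 * math.log(alpha)


def in_confidence_ellipse(candidate, mu_hat, cov_hat, alpha=0.05):
    """Check whether `candidate` satisfies the quadratic-form inequality (d = 2)."""
    diff = np.asarray(candidate, dtype=float) - np.asarray(mu_hat, dtype=float)
    stat = float(diff @ np.linalg.solve(np.asarray(cov_hat, dtype=float), diff))
    return stat <= chi2_quantile_2df(alpha)


def ellipse_semi_axes(cov_hat, alpha=0.05):
    """Semi-axis lengths of the ellipse, in increasing order."""
    eigenvalues = np.linalg.eigvalsh(np.asarray(cov_hat, dtype=float))
    return np.sqrt(eigenvalues * chi2_quantile_2df(alpha))
```

Since the estimated covariance shrinks as the effective sample size grows, the semi-axes returned by `ellipse_semi_axes` shrink accordingly.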

4 Numerical study 

This section is divided into two parts. In the first one, we are interested in the estimation of
the conditional confidence ellipsoid of the multivariate $L_1$-median regression. The second part
is devoted to an application to real chemometric data, which consists in predicting a
three-dimensional vector.

4.1 Simulation example

Let us consider a bi-dimensional vector $Y = (Y^1, Y^2)^t \in \mathbb{R}^2$ and let $X(t)$ be a
Brownian motion trajectory defined on [0, 1]. The eigenfunctions of the covariance operator of X
are known to be (see Ash and Gardner (1975)), for j = 1, 2, ...,

$$f_j(t) = \sqrt{2}\,\sin\{(j - 0.5)\pi t\}, \quad t \in [0, 1].$$

Let $(f_1(t))_{t \in [0,1]}$ (resp. $(f_2(t))_{t \in [0,1]}$) be the eigenfunction corresponding to
the largest (resp. second largest) eigenvalue of the covariance operator of X. It is well known
that $f_1$ and $f_2$ are orthogonal by construction, i.e.
$\langle f_1, f_2\rangle := \int_0^1 f_1(t)f_2(t)\,dt = 0$.
We then model the dependence between Y and X as follows:

$$Y^1 = \int_0^1 f_1(t)X(t)\,dt + \varepsilon, \qquad Y^2 = \int_0^1 f_2(t)X(t)\,dt + \varepsilon,$$

where $\varepsilon$ is a standard normal random variable.
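The simulation design above can be sketched in Python as follows. This is a hedged illustration, not the authors' code: it assumes a grid of 100 equispaced points, a trapezoidal rule for the integrals, and a single noise variable shared by both components (the text does not say whether the two noise terms are distinct). All function names are hypothetical.

```python
import numpy as np


def eigenfunction(j, t):
    """f_j(t) = sqrt(2) * sin((j - 0.5) * pi * t)."""
    return np.sqrt(2.0) * np.sin((j - 0.5) * np.pi * t)


def trapezoid(values, step):
    """Trapezoidal rule along the last axis on an equispaced grid."""
    values = np.asarray(values, dtype=float)
    return step * (values[..., 1:-1].sum(axis=-1)
                   + 0.5 * (values[..., 0] + values[..., -1]))


def simulate_sample(n, n_grid=100, seed=0):
    """Simulate n Brownian paths on [0, 1] and the associated bivariate responses."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_grid)
    step = t[1] - t[0]
    # Brownian motion: cumulative sum of independent N(0, step) increments, X(0) = 0.
    increments = rng.normal(scale=np.sqrt(step), size=(n, n_grid - 1))
    X = np.hstack([np.zeros((n, 1)), np.cumsum(increments, axis=1)])
    eps = rng.normal(size=n)  # assumption: same noise in both components
    Y = np.column_stack([
        trapezoid(eigenfunction(1, t) * X, step) + eps,
        trapezoid(eigenfunction(2, t) * X, step) + eps,
    ])
    return t, X, Y
```

Calling `simulate_sample(200)` and `simulate_sample(700)` gives samples of the two sizes considered in this section.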





Figure 1: Sample of 200 simulated pairs of observations $(X_i, Y_i)_{i=1,\dots,200}$. The left panel
contains the covariates $X_i$ and the right one their associated vectors $Y_i$.

We have simulated n = 200 and n = 700 independent realizations $(X_i, Y_i)$, $i = 1, \dots, n$. To
deal with the Brownian random functions $X_i(t)$, their sample paths were discretized at 100
equispaced points in [0, 1]. In Figure 1, we plot 200 simulated pairs $(X_i, Y_i)_{i=1,\dots,200}$
as described above. The left panel contains the covariates $X_i$ and the right one the associated
vectors $Y_i = (Y_i^1, Y_i^2)$.

We aim to assess, for a fixed curve X = x, the finite-sample performance of the asymptotic
conditional confidence ellipsoid given by (18). For that we first have to estimate $\mu(x)$. Three
parameters should be fixed in this step: the kernel K, the bandwidth h and the semi-metric
$d(\cdot, \cdot)$ which measures the similarity between curves.

Choice of the kernel: there are many possible density kernel functions. Specialists in
nonparametric estimation agree that the exact form of the kernel function affects the final
estimate much less than the choice of the bandwidth does. In this section, the so-called Gaussian
kernel will be used, which is defined by $K(u) = (2\pi)^{-1/2}\exp(-u^2/2)$, for $u \in \mathbb{R}$.
Choice of the bandwidth $h_n$: the bandwidth determines the smoothness of the estimator.
The problem of the choice of the bandwidth has been widely studied in the nonparametric literature.
Recently, Rachdi and Vieu (2007) have proposed a data-driven criterion for choosing this smoothing
parameter. The proposed criterion can be formulated in terms of a functional version of
cross-validation ideas. Antoniadis et al. (2009) treated the same problem in the context of time
series prediction. In the following, the bandwidth $h_n$ is selected by the $L_1$ cross-validation
method:



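$$h_{n,\mathrm{opt}} = \arg\min_{h} \sum_{i=1}^n \left\|Y_i - \mu_{(-i)}(X_i)\right\|, \tag{19}$$

where $\mu_{(-i)}$ denotes the estimator built from the sample without the i-th observation.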
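The leave-one-out criterion (19) can be sketched in Python as below. This is a minimal illustration under several assumptions: the candidate bandwidths form a finite grid chosen by the user, the matrix `D` of pairwise semi-metric distances $d(X_i, X_j)$ is precomputed, and the weighted $L_1$-median is approximated by a fixed number of Weiszfeld-type iterations. The helper names are hypothetical.

```python
import numpy as np


def gaussian_kernel(t):
    """Gaussian kernel K(u) = (2*pi)^(-1/2) * exp(-u^2 / 2)."""
    return np.exp(-0.5 * np.asarray(t, dtype=float) ** 2) / np.sqrt(2.0 * np.pi)


def weighted_l1_median(Y, w, n_iter=300):
    """Approximate argmin_u sum_i w_i ||Y_i - u|| (Weiszfeld-type iteration)."""
    u = w @ Y / w.sum()
    for _ in range(n_iter):
        norms = np.maximum(np.linalg.norm(Y - u, axis=1), 1e-12)
        coef = w / norms
        u = coef @ Y / coef.sum()
    return u


def loo_cv_bandwidth(D, Y, candidates, kernel=gaussian_kernel):
    """Return the candidate h minimizing sum_i ||Y_i - mu_(-i)(X_i)|| and its score."""
    Y = np.asarray(Y, dtype=float)
    n = len(Y)
    best_h, best_score = None, np.inf
    for h in candidates:
        score = 0.0
        for i in range(n):
            keep = np.arange(n) != i          # leave the i-th pair out
            k = kernel(D[i, keep] / h)
            if k.sum() <= 0:                   # no usable neighbour for this h
                score = np.inf
                break
            score += np.linalg.norm(Y[i] - weighted_l1_median(Y[keep], k))
        if score < best_score:
            best_h, best_score = h, score
    return best_h, best_score
```

The cost grows as the number of candidates times n estimator evaluations, so a coarse grid of candidates is usually a sensible starting point.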



Choice of the semi-metric $d(\cdot, \cdot)$: because of the roughness of our covariate curves, we
chose a semi-metric computed from the functional principal components analysis with dimension q = 2.
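A semi-metric based on principal components can be sketched as follows: curves are projected on the first q eigenvectors of an empirical covariance matrix, and the Euclidean distance between the projected scores is used. This is an illustrative sketch under stated assumptions (discretized curves on a common grid, centred covariance, no quadrature weight for the grid step, which only rescales the distances); the function names are hypothetical and the exact definition used by the authors may differ.

```python
import numpy as np


def fit_pca_directions(curves, q=2):
    """Top-q eigenvectors of the empirical covariance of discretized curves."""
    curves = np.asarray(curves, dtype=float)
    centred = curves - curves.mean(axis=0)
    cov = centred.T @ centred / len(curves)
    eigenvalues, eigenvectors = np.linalg.eigh(cov)  # eigenvalues in ascending order
    return eigenvectors[:, ::-1][:, :q]


def pca_semimetric(curves_a, curves_b, directions):
    """Matrix of distances between projected scores of two sets of curves."""
    scores_a = np.asarray(curves_a, dtype=float) @ directions
    scores_b = np.asarray(curves_b, dtype=float) @ directions
    diff = scores_a[:, None, :] - scores_b[None, :, :]
    return np.linalg.norm(diff, axis=2)
```

Two distinct curves whose difference is orthogonal to the retained directions are at distance zero, which is why this is a semi-metric rather than a metric.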




Figure 2: Confidence ellipsoids of $\mu(x)$ when n = 200 (solid lines) and n = 700 (dashed lines);
the centers of the ellipses, at $(\mu_n^1(x), \mu_n^2(x))$, are denoted by a triangle (n = 200) and
a cross (n = 700).

In Figure 2, we plot the 95% confidence ellipses of $\mu(x)$ when $x = 0_{\mathcal{F}}$. We observe
from Figure 2 that the lengths of the major and the minor axes of the confidence ellipse decrease
when the sample size n increases. Similar results were obtained for other sample sizes n and values
of the curve x.






4.2 Application to Chemiometrical data prediction 

The purpose of this section is to apply our method based on multivariate $L_1$-median regression to
some real chemometric data and to compare our results to those obtained with other definitions of
the conditional median studied in the literature. For that, we used a sample of spectrometric data
available on the web site: http://lib.stat.cmu.edu/datasets/tecator. We have a sample of n = 215
pieces of meat and, for each unit i, we observe one discretized spectrometric curve
$X_i(\lambda)$ which corresponds to the absorbance measured at a grid of 100 wavelengths (i.e.
$X_i(\lambda) = (X_i(\lambda_1), X_i(\lambda_2), \dots, X_i(\lambda_{100}))$).
Figure 3 plots the spectrometric curves. Moreover, for each unit i, we have at hand its Moisture
content ($Y^1$), Fat content ($Y^2$) and Protein content ($Y^3$) obtained by analytical chemical
processing.




Figure 3: The 215 spectrometric curves (horizontal axis: wavelengths).

Let us denote by $Y = (\text{moisture}, \text{fat}, \text{protein})^t := (Y^1, Y^2, Y^3)^t$ the
vector of specific chemical contents of meat. Given a new spectrometric curve
$X_{new}(\lambda)$, our purpose is to predict simultaneously the corresponding vector of chemical
contents Y using the multivariate $L_1$-median regression. Obtaining a spectrometric curve is less
expensive (in terms of time and cost) than the analytical chemistry needed for determining the
percentage of chemical contents. So, it is an important economic challenge to predict the whole
vector Y from the spectrometric curve.

Let us consider 215 observations $(X_1(\lambda), Y_1), \dots, (X_{215}(\lambda), Y_{215})$ split
into two samples: a learning sample (160 observations) and a test sample (55 observations). We
compare the following three methods, based on the multivariate conditional median, to predict the
vector of chemical contents Y of the test sample. In the following three approaches, we choose the
quadratic kernel K defined by:

$$K(u) = \frac{3}{2}\,(1 - u^2)\,\mathbb{1}_{[0,1]}(u).$$

Figure 4: The sample of 215 pieces of meat.

(i) Non-functional approach (NF)

This method is based on the definition of the conditional spatial median studied by Gannoun et al.
(2003b) and Cheng and De Gooijer (2007). This approach does not consider the covariate X as
a function but as a vector of dimension 100, while the response variable Y is a vector. For each
$i = 1, \dots, 160$ in the learning sample, the i-th vector $Y_i$ is predicted as follows:

$$\widehat{Y}_i = \widehat{\mu}^{NF}(X_i),$$

where

$$\widehat{\mu}^{NF}(X_i) = \arg\min_{u \in \mathbb{R}^3} \sum_{j=1}^{160} w_j^{NF}(X_i)\,\|Y_j - u\| \quad \text{and} \quad w_j^{NF}(X_i) = \frac{K\left(\|X_i - X_j\|/h_n\right)}{\sum_{k=1}^{160} K\left(\|X_i - X_k\|/h_n\right)}$$

are the so-called Nadaraya-Watson weights.

For the choice of the bandwidth $h_n$, Cheng and De Gooijer (2007) gave the exact expression of the
optimal bandwidth that minimizes the asymptotic mean square error. In this case $h_n$ is of the
rate $n^{-1/104 + \epsilon}$, where $\epsilon > 0$ is a sufficiently small constant.

(ii) Vector Coordinate Conditional Median (VCCM)

This approach supposes that the covariate X is considered as functional. For each
$i = 1, \dots, 160$ in the learning sample, we predict each component of its vector response $Y_i$
by the one-dimensional conditional median. Then we obtain the vector of coordinate conditional
medians (VCCMs) defined as

$$\widehat{Y}_i = \left(\widehat{\mu}^1(X_i), \widehat{\mu}^2(X_i), \widehat{\mu}^3(X_i)\right),$$

where each component $\widehat{\mu}^\ell(X_i) = (F_n^\ell)^{-1}(1/2 \mid X_i)$ is the
one-dimensional conditional median estimator.
$F_n^\ell(\cdot \mid X_i)$ is the conditional distribution function estimator of the component
$Y^\ell$ given $X = X_i$. Ferraty and Vieu (2006), p. 56, have proposed a Nadaraya-Watson kernel
estimator of the conditional distribution, $F^\ell(\cdot \mid X = X_i)$, when the covariate takes
values in some infinite-dimensional space. This estimator is given by

$$F_n^\ell(y^\ell \mid X = X_i) = \frac{\sum_{k=1}^{160} \mathbb{1}_{\{Y_k^\ell \leq y^\ell\}}\, K\left(d(X_i, X_k)/h_n\right)}{\sum_{k=1}^{160} K\left(d(X_i, X_k)/h_n\right)}, \quad y^\ell \in \mathbb{R}.$$

To apply this approach, we used Ferraty and Vieu's R routine funopare.quantile.lcv¹ to estimate
$\widehat{\mu}^\ell(X_i)$. The optimal bandwidth is chosen by the cross-validation method on the k
nearest neighbours (see Ferraty and Vieu (2006), p. 102 for more details).
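The coordinate-wise conditional median can be illustrated by inverting the kernel estimator of the conditional c.d.f. at level 1/2. The Python sketch below is not the R routine mentioned above; it is a simplified stand-in with a fixed bandwidth h, a quadratic kernel, precomputed distances $d(X_i, X_k)$, and hypothetical function names.

```python
import numpy as np


def quadratic_kernel(t):
    """Quadratic kernel on [0, 1], up to a normalizing constant."""
    t = np.asarray(t, dtype=float)
    return np.where(t <= 1.0, 1.0 - t**2, 0.0)


def conditional_median_1d(y, distances, h, kernel=quadratic_kernel):
    """Smallest observed value at which the kernel conditional c.d.f. reaches 1/2."""
    y = np.asarray(y, dtype=float)
    k = kernel(np.asarray(distances, dtype=float) / h)
    if k.sum() <= 0:
        raise ValueError("no curve falls within bandwidth h")
    order = np.argsort(y)
    cdf = np.cumsum(k[order]) / k.sum()       # c.d.f. evaluated at the sorted responses
    index = min(np.searchsorted(cdf, 0.5), len(y) - 1)
    return y[order][index]


def vccm(Y, distances, h, kernel=quadratic_kernel):
    """Vector of coordinate-wise conditional medians for a multivariate response."""
    Y = np.asarray(Y, dtype=float)
    return np.array([conditional_median_1d(Y[:, l], distances, h, kernel)
                     for l in range(Y.shape[1])])
```

Because each coordinate is treated separately, this predictor ignores the dependence between the components of Y, which is the point of comparison with the multivariate median.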

{in) Conditional Multivariate Median (CMM) 

The approach that we propose here treats the covariate X as a curve and the response Y as a
vector. For each i = 1, . . . , 160 in the learning sample we take

\[ \hat{Y}_i = \hat{\mu}(X_i), \]

where

\[ \hat{\mu}(X_i) = \arg\min_{u} \sum_{j=1}^{160} w_{n,j}(X_i)\, \|Y_j - u\|. \qquad (20) \]

To estimate the conditional multivariate median, $\mu(X_i)$, we have adapted the algorithm proposed
by Vardi and Zhang (2000) to the conditional case and used the function spatial.median from
the R package ICSNP. As in the previous approach, the optimal bandwidth is chosen by
cross-validation on the k nearest neighbours.
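
The minimization in (20) can be sketched as a weighted Weiszfeld iteration with the Vardi and Zhang (2000) correction for iterates that coincide with a data point. The Python code below is only an illustration under stated assumptions: it is not the authors' adaptation nor the spatial.median function, and the name weighted_spatial_median, the starting point and the tolerances are hypothetical choices.

```python
import numpy as np

def weighted_spatial_median(points, weights, tol=1e-9, max_iter=1000):
    """Minimize sum_j w_j * ||Y_j - u|| over u by a modified Weiszfeld iteration."""
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    keep = weights > 0                      # zero-weight points play no role
    points, weights = points[keep], weights[keep]
    u = np.average(points, axis=0, weights=weights)  # start at the weighted mean
    for _ in range(max_iter):
        dist = np.linalg.norm(points - u, axis=1)
        away = dist > 1e-12                 # points not coinciding with u
        if not away.any():
            return u
        inv = weights[away] / dist[away]
        t = (inv[:, None] * points[away]).sum(axis=0) / inv.sum()
        eta = weights[~away].sum()          # total weight sitting exactly at u
        if eta > 0:
            # Vardi-Zhang step: compare the pull of the other points with eta
            r = np.linalg.norm((inv[:, None] * (points[away] - u)).sum(axis=0))
            if r <= eta:
                return u                    # u already satisfies the optimality condition
            gamma = eta / r
            new_u = (1.0 - gamma) * t + gamma * u
        else:
            new_u = t
        if np.linalg.norm(new_u - u) < tol:
            return new_u
        u = new_u
    return u
```

With weights equal to the kernel weights $w_{n,j}(X_i)$ and points equal to the learning-sample responses $Y_j$, the returned vector plays the role of $\hat{\mu}(X_i)$.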

A common evaluation procedure: 

We have adapted to the multivariate case the algorithm proposed by Attouch et al. (2009)
and Ferraty and Vieu (2006, p. 103) in order to get the optimal smoothing parameter $h_n$ for each
$X_i$ in the test sample.

Step 1. We compute the kernel estimator $\hat{\mu}(X_j)$ (resp. $\hat{\mu}^{\ell}(X_j)$) for all j by using the training sample.
Step 2. For each $X_i$ in the test sample, we set $i^* = \arg\min_{j=1,\dots,160} d(X_i, X_j)$.
Step 3. For each i = 161, . . . , 215, we take

\[ \hat{\mu}(X_i) = \hat{\mu}(X_{i^*}) \quad \text{and} \quad \hat{\mu}^{\ell}(X_i) = \hat{\mu}^{\ell}(X_{i^*}). \]



¹ Available at the website www.lsp.ups-tlse.fr/staph/npfda.
         |             CMM               |             VCCM              |              NF
         | Mean   Q0.25  Q0.5   Q0.75    | Mean   Q0.25  Q0.5   Q0.75    | Mean   Q0.25  Q0.5   Q0.75
---------+-------------------------------+-------------------------------+------------------------------
Moist.   | 1.301  0.479  1.100  2.202    | 1.776  0.460  1.879  2.383    | 7.222  1.663  6.374  11.44
Fat      | 1.565  0.430  1.500  2.401    | 2.343  0.925  1.716  2.867    | 9.758  2.328  8.4    15.24
Prot.    | 1.125  0.300  0.800  1.437    | 1.313  0.518  1.182  1.806    | 2.446  0.787  2.329  3.394
R(Y)     | 2.638  1.349  2.530  3.623    | 3.561  1.877  2.909  3.799    | 12.6   3.523  10.6   19.27

Table 1: Distribution of absolute errors for Moisture, Fat and Protein and global estimation error
of the vector Y.



The bandwidth used for each curve $X_i$ in the test sample is the one obtained for the nearest
curve in the learning sample. Because the spectrometric curves presented in Figure 3 are very
smooth, we can choose as semi-metric $d(\cdot, \cdot)$ the $L_2$ distance between the second derivatives of the
curves. This choice was made by Attouch et al. (2009) and Ferraty et al. (2007) for the same
spectrometric curves.
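
The semi-metric and the nearest-curve lookup described above can be sketched as follows. This is a minimal illustration assuming curves observed on a regular grid and second derivatives approximated by finite differences (the cited references may rely on a different approximation, such as spline smoothing); the names second_derivative_semimetric and nearest_training_index are hypothetical.

```python
import numpy as np

def second_derivative_semimetric(curve_a, curve_b, step=1.0):
    """Approximate L2 distance between the second derivatives of two discretized curves."""
    a = np.asarray(curve_a, dtype=float)
    b = np.asarray(curve_b, dtype=float)
    d2a = np.gradient(np.gradient(a, step), step)   # finite-difference second derivative
    d2b = np.gradient(np.gradient(b, step), step)
    return float(np.sqrt(np.sum((d2a - d2b) ** 2) * step))

def nearest_training_index(test_curve, train_curves, step=1.0):
    """Index i* of the training curve closest to test_curve under the semi-metric."""
    dists = [second_derivative_semimetric(test_curve, c, step) for c in train_curves]
    return int(np.argmin(dists))
```

The bandwidth selected for the training curve with index i* would then be reused for the test curve. Note that two curves differing only by an affine function are at distance zero, which is why d is a semi-metric rather than a metric.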

Both the (CMM) and (NF) methods take into account the covariance structure between the components
of the vector Y. In fact, the correlation coefficients between $Y^1$ = moisture, $Y^2$ = fat and
$Y^3$ = protein are given by $\rho_{1,2} = -0.988$, $\rho_{1,3} = 0.814$ and $\rho_{2,3} = -0.860$. Since the moisture,
fat and protein contents of meat are strongly correlated, it is more appropriate to predict
these variables simultaneously rather than each one separately.
To compare the (CMM), (NF) and (VCCM) methods, we rely on the following criteria:

• The Absolute Error (AE) measures the prediction error for each component of Y:

\[ AE_i^j = |Y_i^j - C^j(X_i)|, \quad \forall i = 161, \dots, 215 \text{ and } j = 1, 2, 3. \]

• A global criterion (R) measures the error made in predicting the vector $Y_i$ (for i = 161, . . . , 215):

\[ R(Y_i) = \|Y_i - C(X_i)\|_{eucl}, \]

where $C := (C^1, C^2, C^3)^T$ denotes the estimator of the vector Y, component by component, obtained by the
(VCCM), (NF) or (CMM) method.
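
The two criteria, together with the summary statistics reported in Table 1 (mean and quartiles), can be computed as in the following sketch. The function names and the array layout (one row per test curve, one column per component) are assumptions made for illustration.

```python
import numpy as np

def absolute_errors(y_true, y_pred):
    """AE_i^j = |Y_i^j - C^j(X_i)|, one row per test curve, one column per component."""
    return np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))

def global_errors(y_true, y_pred):
    """R(Y_i) = Euclidean norm of Y_i - C(X_i), one value per test curve."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return np.linalg.norm(diff, axis=1)

def summarize(errors):
    """Mean and quartiles of a one-dimensional array of errors."""
    q25, q50, q75 = np.quantile(errors, [0.25, 0.5, 0.75])
    return {"mean": float(np.mean(errors)), "q25": float(q25),
            "q50": float(q50), "q75": float(q75)}
```

Calling summarize on each column of absolute_errors and on global_errors yields, for one method, the four summary values of each row of a table laid out like Table 1.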

We can conclude from Table 1 that our method is more appropriate for predicting meat components than
(VCCM). In fact, the (VCCM) approach predicts each component of Y separately using the conditional
univariate median. This method treats the components of Y as if they were independent and does not take
into account the correlation structure between variables. The Non-Functional approach gives the
largest prediction errors, which is due to the dimension of the covariate (100 in this
case). This problem is well known in nonparametric estimation as the curse of dimensionality. Taking
into account the functional nature of the covariate seems to be necessary in such a case.
5 Concluding remarks

In this paper, we have introduced a kernel-based estimator for the L1-median of a multivariate
conditional distribution when covariates take values in an infinite-dimensional space. Prediction using
least squares estimates of regression parameters is highly sensitive to outlying points; the
conditional L1-median offers a more robust alternative for prediction. We have shown that
our estimator is well adapted to predicting a multivariate response vector. In fact, in contrast to the
Vector Coordinate Conditional Median method, the multivariate conditional L1-median takes into
account the interdependence of the coordinates of the response vector. Asymptotic results, i.e.,
almost sure consistency and asymptotic normality, have been given under some regularity conditions.
Many extensions of this work are possible. For instance, the same type of theoretical results could
be obtained in a dependent framework (e.g. mixing dependence). Furthermore, it is well
known that quantiles are very useful tools for detecting outliers and for modelling the dependence on
the covariates in the lower and upper tails of the response distribution. In future work, we aim to
generalize our study to multivariate quantile regression when covariates take values in some
infinite-dimensional space.

Appendix: Proofs 

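In order to prove our results we have to introduce some further notation. Let

\[ \bar{G}^x_{n,2}(u) = E\left(G^x_{n,2}(u)\right) := \frac{1}{E(\Delta_1(x))}\, E\left[\|Y_1 - u\|\, \Delta_1(x)\right], \]

and define the bias of $G^x_n(u)$ as

\[ B^x_n(u) = \bar{G}^x_{n,2}(u) - G^x(u). \]

Consider now the following quantities

\[ R^x_n(u) = -B^x_n(u)\left(G^x_{n,1} - 1\right) \]

and

\[ Q^x_n(u) = \left(G^x_{n,2}(u) - \bar{G}^x_{n,2}(u)\right) - G^x(u)\left(G^x_{n,1} - 1\right). \]

It is then clear that the following decomposition holds

\[ G^x_n(u) - G^x(u) = B^x_n(u) + \frac{R^x_n(u) + Q^x_n(u)}{G^x_{n,1}}. \qquad (21) \]

Since $G^x_{n,1}$ is independent of $u$, it follows from decomposition (21) that

\[ \sup_u |G^x_n(u) - G^x(u)| \le \sup_u |B^x_n(u)| + \frac{\sup_u |R^x_n(u) + Q^x_n(u)|}{G^x_{n,1}}. \qquad (22) \]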
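
As a quick check of decomposition (21), the following short derivation may help. It is only a sketch and assumes, as the notation suggests, that $G^x_n(u) = G^x_{n,2}(u)/G^x_{n,1}$, that $B^x_n(u) = \bar{G}^x_{n,2}(u) - G^x(u)$ and $R^x_n(u) = -B^x_n(u)(G^x_{n,1} - 1)$, and that $Q^x_n(u)$ has the expression recalled in the proof of Lemma 5.5.

```latex
\begin{align*}
R^x_n(u) + Q^x_n(u)
  &= -B^x_n(u)\,(G^x_{n,1} - 1) + G^x_{n,2}(u) - \bar{G}^x_{n,2}(u) - G^x(u)\,(G^x_{n,1} - 1) \\
  &= G^x_{n,2}(u) - G^x(u)\,G^x_{n,1} - B^x_n(u)\,G^x_{n,1}
     \qquad \text{(using } \bar{G}^x_{n,2}(u) = B^x_n(u) + G^x(u)\text{)}, \\
\frac{R^x_n(u) + Q^x_n(u)}{G^x_{n,1}}
  &= G^x_n(u) - G^x(u) - B^x_n(u).
\end{align*}
```

Rearranging the last line gives (21).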
The proof of Proposition 3.1 is split up into several lemmas, given hereafter, establishing
respectively the almost sure (a.s.) convergence of $G^x_{n,1}$ to 1 and that of $B^x_n(u)$, $R^x_n(u)$ and $Q^x_n(u)$
(with rate) to zero.

We start with the following technical lemma, whose proof may be found in Ferraty et al. (2007).

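Lemma 5.1 Assume that conditions (H1), (H2) hold true. For any real numbers $j \ge 1$ and $k \ge 1$,
as $n \to \infty$, we have

(i) $\frac{1}{\phi(h)} E\left[\Delta_1^j(x)\right] = M_j\, g(x) + o(1)$,

(ii) $\frac{1}{\phi^k(h)} \left(E(\Delta_1(x))\right)^k = M_1^k\, g^k(x) + o(1)$.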

The lemma below gives the convergence rate of the quantity $G^x_{n,1}$.

Lemma 5.2 Under assumptions (H1)-(H2) and condition (10)(i), we have

\[ G^x_{n,1} - 1 = O_{a.s.}\left(\sqrt{\frac{\log n}{n\phi(h)}}\right). \]

Proof of Lemma 5.2. Let us write

\[ G^x_{n,1} - 1 = \frac{1}{n} \sum_{i=1}^{n} L_{n,i}(x), \]

where $L_{n,i}(x) = \Delta^*_i(x) - E(\Delta^*_i(x))$ and $\Delta^*_i(x) = \Delta_i(x)/E(\Delta_1(x))$. To apply the exponential inequality given
by Corollary A.8(i) of Ferraty and Vieu (2006) in Appendix A, we have first to show that for all
$m \ge 2$ there exists a positive constant $C_m$ such that $E|L^m_{n,1}(x)| \le C_m a^{2(m-1)}$. We have

\[ E\left(|L_{n,1}(x)|^m\right) \le C \sum_{k=0}^{m} \binom{m}{k} E\left[(\Delta^*_1(x))^k\right] \left[E(\Delta^*_1(x))\right]^{m-k}. \]

Then using Lemma 5.1 we get $E(|L_{n,1}(x)|^m) \le C_m \max_{k=0,1,\dots,m} (\phi(h))^{1-k} \le C_m (\phi(h))^{1-m}$.
Therefore, we have $a^2 = (\phi(h))^{-1}$. Now, for all $\epsilon > 0$, we have

\[ P\left(|G^x_{n,1} - 1| > \epsilon\right) \le 2 \exp\left\{-\frac{n\epsilon^2}{2(\phi(h))^{-1}(1 + \epsilon)}\right\}. \]

The desired result follows from the Borel-Cantelli lemma by choosing $\epsilon = \epsilon_0 \sqrt{\log n/(n\phi(h))}$, where $\epsilon_0$ is
a large enough positive constant. □
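
To see why a large enough $\epsilon_0$ makes the series of probabilities summable, the exponent can be worked out explicitly. This is only a sketch, assuming that the exponential bound takes the form $2\exp\{-n\epsilon^2/(2a^2(1+\epsilon))\}$ with $a^2 = (\phi(h))^{-1}$ and that $\log n/(n\phi(h)) \to 0$.

```latex
\begin{align*}
\epsilon_n = \epsilon_0 \sqrt{\frac{\log n}{n\phi(h)}}
  \;&\Longrightarrow\;
  \frac{n\,\epsilon_n^2\,\phi(h)}{2(1+\epsilon_n)} = \frac{\epsilon_0^2 \log n}{2(1+\epsilon_n)}, \\
P\left(|G^x_{n,1} - 1| > \epsilon_n\right)
  &\le 2\exp\left(-\frac{\epsilon_0^2 \log n}{2(1+\epsilon_n)}\right)
   = 2\,n^{-\epsilon_0^2/(2(1+\epsilon_n))}.
\end{align*}
```

Since $\epsilon_n \to 0$, the exponent $\epsilon_0^2/(2(1+\epsilon_n))$ eventually exceeds 1 as soon as $\epsilon_0^2 > 2$, so the series converges and the Borel-Cantelli lemma applies.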

The following lemma describes the uniform asymptotic behavior of the conditional bias term
$B^x_n(u)$ as well as that of $R^x_n(u)$ with respect to $u$.

Lemma 5.3 (i) Under conditions (H1)-(H2) and (H3)(i), we have

\[ \sup_{u \in \mathbb{R}^d} |B^x_n(u)| = O_{a.s.}(h^{\beta}). \qquad (23) \]

(ii) If, in addition to (H1)-(H2), condition (10) is satisfied, we have

\[ \sup_{u \in \mathbb{R}^d} |R^x_n(u)| = O_{a.s.}\left(h^{\beta} \sqrt{\frac{\log n}{n\phi(h)}}\right). \qquad (24) \]

Proof of Lemma 5.3. Recall that

\[ B^x_n(u) = \bar{G}^x_{n,2}(u) - G^x(u). \]

Conditioning on X and using the definition of $G^x(u)$ and condition (H3)(i), one has

\begin{align*}
|B^x_n(u)| &= \left| \frac{1}{E\Delta_1(x)} E\left\{\Delta_1(x)\, E\left[\|Y_1 - u\| \mid X_1\right]\right\} - G^x(u) \right| \\
&= \left| \frac{1}{E\Delta_1(x)} E\left\{\Delta_1(x)\left(G^{X_1}(u) - G^x(u)\right)\right\} \right| \\
&\le \sup_{x' \in B(x,h)} |G^{x'}(u) - G^x(u)| = O_{a.s.}(h^{\beta}).
\end{align*}

The latter quantity is independent of $u$, which leads to $\sup_{u \in \mathbb{R}^d} |B^x_n(u)| = O_{a.s.}(h^{\beta})$.

Now, to deal with the quantity $R^x_n(u)$, write it as $R^x_n(u) = -B^x_n(u)\left[G^x_{n,1} - 1\right]$. Therefore

\[ \sup_{u} |R^x_n(u)| = \sup_{u} |B^x_n(u)|\, |G^x_{n,1} - 1|. \]

The statement (24) follows from (23) combined with Lemma 5.2. □



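Lemma 5.4 Under assumptions (H1)-(H2), (H4)(i), conditions (10) and (11) we have

\[ \sup_{u \in \mathbb{R}^d} |G^x_{n,2}(u) - \bar{G}^x_{n,2}(u)| = O_{a.s.}\left(\sqrt{\frac{\log n}{n\phi(h)}}\right). \]

Proof of Lemma 5.4. For $u \in \mathbb{R}^d$ and $r > 0$, let

\[ S(u, r) = \{u' : u' \in \mathbb{R}^d,\ \|u' - u\| \le r\} \]

be the ball of radius $r$ centered at $u$. Let $[-n^{\gamma}, n^{\gamma}]^d$, for $1/2 < \gamma < 2$, be an interval of $\mathbb{R}^d$. Divide
$[-n^{\gamma}, n^{\gamma}]$ into $k_n$ subintervals each of length $b_n = [2n^{\gamma}/k_n]$ (where $[t]$ is the integer part of $t$). Since
the set $S(0, n^{\gamma}) = \{u' : \|u'\| \le n^{\gamma}\}$ is compact, it can be covered by $k_n^d$ bounded hypercubes of the
form

\[ S_{n,j} := S(u_j, b_n) = \{u : \|u - u_j\| \le b_n\}, \qquad j = 1, \dots, k_n^d. \]

We have

\begin{align*}
\sup_{\|u\| \le n^{\gamma}} |G^x_{n,2}(u) - \bar{G}^x_{n,2}(u)|
&\le \max_{j} \sup_{u \in S_{n,j}} |G^x_{n,2}(u) - G^x_{n,2}(u_j)| + \max_{j} |G^x_{n,2}(u_j) - \bar{G}^x_{n,2}(u_j)| \\
&\quad + \max_{j} \sup_{u \in S_{n,j}} |\bar{G}^x_{n,2}(u) - \bar{G}^x_{n,2}(u_j)| := I_{n,1} + I_{n,2} + I_{n,3}. \qquad (25)
\end{align*}

Observe now that

\[ \sup_{u \in S_{n,j}} |G^x_{n,2}(u) - G^x_{n,2}(u_j)| \le \frac{1}{nE(\Delta_1(x))} \sum_{i=1}^{n} \sup_{u \in S_{n,j}} \Big| \|Y_i - u\| - \|Y_i - u_j\| \Big|\, \Delta_i(x) \le b_n\, G^x_{n,1} \]

and

\[ \sup_{u \in S_{n,j}} |\bar{G}^x_{n,2}(u) - \bar{G}^x_{n,2}(u_j)| \le E\left[\sup_{u \in S_{n,j}} |G^x_{n,2}(u) - G^x_{n,2}(u_j)|\right] \le b_n. \]

If we denote by $a_n = \sqrt{n\phi(h)/\log n}$ the convergence rate, one gets by Lemma 5.2

\[ a_n(I_{n,1} + I_{n,3}) = O\left(a_n b_n (1 + G^x_{n,1})\right) = O(a_n b_n) = O(a_n n^{\gamma}/k_n). \]

The choice of $k_n = [a_n n^{\gamma} \log n]$ implies that

\[ a_n(I_{n,1} + I_{n,3}) = o(1). \qquad (26) \]

In order to evaluate the term $I_{n,2}$, let us denote by

\[ \Delta^*_i(x) = \frac{\Delta_i(x)}{E(\Delta_1(x))} \]

and

\[ Z_{n,i}(x) = \|Y_i - u_j\|\, \Delta^*_i(x) - E\left[\|Y_i - u_j\|\, \Delta^*_i(x)\right]. \]

Then, we have

\[ G^x_{n,2}(u_j) - \bar{G}^x_{n,2}(u_j) = \frac{1}{n} \sum_{i=1}^{n} Z_{n,i}(x). \]

For all $m \in \mathbb{N} \setminus \{0\}$, observe that

\[ Z^m_{n,i}(x) = \sum_{k=0}^{m} \binom{m}{k} \left(\|Y_i - u_j\|\, \Delta^*_i(x)\right)^k (-1)^{m-k} \left[E\left(\|Y_i - u_j\|\, \Delta^*_i(x)\right)\right]^{m-k}. \]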
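In order to apply an exponential-type inequality, we have to give an upper bound for $E(|Z_{n,i}(x)|^m)$.
It follows from the above expansion that

\[ E\left(|Z_{n,i}(x)|^m\right) \le C \sum_{k=0}^{m} \binom{m}{k} E\left[\left(\|Y_i - u_j\|\, \Delta^*_i(x)\right)^k\right] \left[E\left(\|Y_i - u_j\|\, \Delta^*_i(x)\right)\right]^{m-k}. \]

On the other hand, we have for any $k \ge 2$

\[ E\left[\left(\|Y_i - u_j\|\, \Delta^*_i(x)\right)^k\right] = E\left[(\Delta^*_i(x))^k\, E\left(\|Y_i - u_j\|^k \mid X_i\right)\right] = E\left[(\Delta^*_i(x))^k\, G_k^{X_i}(u_j)\right]. \]

Using the first part of condition (H4)(i), which implies that $G_k^x(u_j)$ is bounded uniformly for all $j$,
one may write

\begin{align*}
E\left[\left(\|Y_i - u_j\|\, \Delta^*_i(x)\right)^k\right]
&\le E\left[(\Delta^*_i(x))^k\, |G_k^{X_i}(u_j) - G_k^x(u_j)|\right] + G_k^x(u_j)\, E\left[(\Delta^*_i(x))^k\right] \\
&\le E\left[(\Delta^*_i(x))^k\right] \left[\max_{j} \sup_{x' \in B(x,h)} |G_k^{x'}(u_j) - G_k^x(u_j)| + \max_{j} G_k^x(u_j)\right] \\
&\le C_0\, E\left[(\Delta^*_i(x))^k\right],
\end{align*}

where $C_0$ is a positive constant. Moreover, we have $E(\|Y_i - u_j\|\, \Delta^*_i(x)) = O(1)$ uniformly in $j$ since
$E[\Delta^*_i(x)] = 1$ and $\sup_u E(\|Y_1 - u\| \mid X) < \infty$ in view of condition (2).

Therefore $\left[E(\|Y_i - u_j\|\, \Delta^*_i(x))\right]^{m-k} = O(1)$.
Next, applying Lemma 5.1, one may write

\[ E\left[(\Delta^*_i(x))^k\right] = (\phi(h))^{1-k} \left[\frac{M_k}{M_1^k\, g^{k-1}(x)} + o(1)\right]. \]

Thus

\[ E\left(|Z_{n,i}(x)|^m\right) \le C_m \max_{k=0,1,\dots,m} (\phi(h))^{1-k}, \]

where $C_m$ is a positive real constant depending on $m$. Because $\phi(h)$ tends to zero as $n$ goes to
infinity, it follows that

\[ E\left(|Z_{n,i}(x)|^m\right) = O\left((\phi(h))^{1-m}\right). \]

Now, applying Corollary A.8(i) in Ferraty and Vieu (2006) $k_n^d$ times with $a^2 = (\phi(h))^{-1}$ we obtain,
by choosing

\[ \epsilon = \epsilon_n = \epsilon_0 \sqrt{v_n}, \quad \text{where } v_n = (a^2 \log n)/n = \log n/(n\phi(h)) \to 0 \text{ as } n \to \infty, \]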
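that

\[ P\left(|I_{n,2}| > \epsilon\right) \le 2 k_n^d \exp\left\{-\frac{\epsilon_0^2 \log n}{2(1 + \epsilon_0 \sqrt{v_n})}\right\} \le 2 k_n^d\, n^{-C\epsilon_0^2}. \]

One may choose $\epsilon_0$ large enough such that

\[ \sum_{n} P\left(|I_{n,2}| > \epsilon\right) < \infty. \]

We conclude by the Borel-Cantelli lemma and (26) that

\[ a_n \sup_{\|u\| \le n^{\gamma}} |G^x_{n,2}(u) - \bar{G}^x_{n,2}(u)| = O_{a.s.}(a_n \sqrt{v_n}) = O_{a.s.}(1). \]

Next, we have

\begin{align*}
\sup_{u \in \mathbb{R}^d} a_n |G^x_{n,2}(u) - \bar{G}^x_{n,2}(u)|
&\le \sup_{\|u\| \le n^{\gamma}} a_n |G^x_{n,2}(u) - \bar{G}^x_{n,2}(u)| + \sup_{\|u\| > n^{\gamma}} a_n |G^x_{n,2}(u) - \bar{G}^x_{n,2}(u)| \\
&= \sup_{\|u\| > n^{\gamma}} a_n |G^x_{n,2}(u) - \bar{G}^x_{n,2}(u)| + O_{a.s.}(1),
\end{align*}

in view of the above result. Now, we have

\begin{align*}
a_n \sup_{u : \|u\| > n^{\gamma}} |G^x_{n,2}(u) - \bar{G}^x_{n,2}(u)|
&\le a_n \sup_{u : \|u\| > n^{\gamma}} |G^x_{n,2}(u)| + a_n \sup_{u : \|u\| > n^{\gamma}} |G^x(u)| \\
&\quad + a_n \sup_{u} |G^x(u) - \bar{G}^x_{n,2}(u)|. \qquad (27)
\end{align*}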

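The last term in (27) is negligible for large $n$ since, conditioning on X, one may write

\[ a_n |\bar{G}^x_{n,2}(u) - G^x(u)| = a_n |B^x_n(u)| = O_{a.s.}(h^{\beta} a_n) = o_{a.s.}(1) \]

in view of Lemma 5.3 (i) whenever condition (10)(ii) is satisfied. For the second term in (27), we have

\[ a_n \sup_{\|u\| > n^{\gamma}} G^x(u) \le \frac{a_n}{n^{\gamma}} \sup_{\|u\| > n^{\gamma}} \|u\|\, G^x(u) = o(1), \]

whenever $\gamma > 1/2$ and condition (11) is satisfied.
Moreover, we have for any $\epsilon > 0$

\begin{align*}
P\left(a_n \sup_{u : \|u\| > n^{\gamma}} |G^x_{n,2}(u)| > \epsilon\right)
&\le P\left(a_n \sup_{u : \|u\| > n^{\gamma}} \frac{1}{nE(\Delta_1)} \sum_{i : \|Y_i - u\| > n^{\gamma}/2} \|Y_i - u\|\, \Delta_i(x) > \epsilon/2\right) \\
&\quad + P\left(a_n \sup_{u : \|u\| > n^{\gamma}} \frac{1}{nE(\Delta_1)} \sum_{i : \|Y_i - u\| \le n^{\gamma}/2} \|Y_i - u\|\, \Delta_i(x) > \epsilon/2\right) \\
&:= J_{n,1} + J_{n,2}.
\end{align*}

To treat $J_{n,1}$, denote by

\[ A_n(\omega) := \left\{\omega : a_n \sup_{\|u\| > n^{\gamma}} \frac{1}{nE(\Delta_1)} \sum_{i=1 : \|Y_i - u\| > n^{\gamma}/2}^{n} \|Y_i - u\|\, \Delta_i > \epsilon/2\right\}. \]

The event $A_n(\omega)$ is nonempty if and only if there exists at least one $i_0$ ($1 \le i_0 \le n$) such that
$\|Y_{i_0} - u\| > n^{\gamma}/2$. Thus $\{A_n(\omega) \ne \emptyset\} \subset \cup_{i=1}^{n} \{\omega : \|Y_i - u\| > n^{\gamma}/2\}$. It follows from Markov's
inequality, if $E(\|Y_1 - u\|) < \infty$, that

\[ P(A_n(\omega) \ne \emptyset) = O(n^{-(\gamma - 1)}) \quad \text{and} \quad \sum_{n} P(A_n(\omega) \ne \emptyset) < \infty, \]

whenever $\gamma > 1$, which implies that $J_{n,1} = o_{a.s.}(1)$ by the Borel-Cantelli lemma.
To deal with $J_{n,2}$, let us denote by

\[ B_n(\omega) := \left\{\omega : a_n \sup_{\|u\| > n^{\gamma}} \frac{1}{nE(\Delta_1)} \sum_{i : \|Y_i - u\| \le n^{\gamma}/2} \|Y_i - u\|\, \Delta_i(x) > \epsilon/2\right\}. \]

$B_n(\omega)$ is nonempty if and only if there exists at least one $i_0$ ($1 \le i_0 \le n$) such that $\|Y_{i_0} - u\| \le n^{\gamma}/2$.
The latter inequality implies that $\|Y_{i_0} - u\| - \|u\| < 0$ whenever $\|u\| > n^{\gamma}$. Moreover, we have (by the
triangle inequality), whenever the above conditions hold, that

\[ \|Y_{i_0}\| \ge \Big| \|Y_{i_0} - u\| - \|u\| \Big| = -\|Y_{i_0} - u\| + \|u\| > n^{\gamma}/2. \]

Therefore,

\[ \{B_n(\omega) \ne \emptyset\} \subset \{\exists\, i_0 : 1 \le i_0 \le n,\ \|Y_{i_0}\| > n^{\gamma}/2\}. \]

We conclude as above that $J_{n,2} = o_{a.s.}(1)$ whenever $E(\|Y_1\|) < \infty$ and $\gamma > 1$.

This ends the proof of Lemma 5.4. □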

Lemma 5.5 Under assumptions (H1)-(H2), (H4)(i) and condition (10) (i), we have 

Proof of Lemma 5.5. In order to check the statement (28), recall that

\[ Q^x_n(u) = \left(G^x_{n,2}(u) - \bar{G}^x_{n,2}(u)\right) - G^x(u)\left(G^x_{n,1} - 1\right). \]

The result then follows from Lemmas 5.2 and 5.4. □

Proof of Proposition 3.1. The proof follows from Lemmas 5.2, 5.3, 5.4 and 5.5. □

Proof of Theorem 3.2.

We have from the definitions of $\mu(x)$ and $\mu_n(x)$ and the existence and uniqueness of these
quantities that

\[ G^x(\mu(x)) = \inf_{u} G^x(u) \quad \text{and} \quad G^x_n(\mu_n(x)) = \inf_{u} G^x_n(u). \]

It then follows that

\begin{align*}
|G^x(\mu(x)) - G^x(\mu_n(x))|
&\le |G^x(\mu(x)) - G^x_n(\mu_n(x))| + |G^x_n(\mu_n(x)) - G^x(\mu_n(x))| \\
&= \left| -\left(-\inf_{u} G^x(u) + \inf_{u} G^x_n(u)\right) \right| + |G^x_n(\mu_n(x)) - G^x(\mu_n(x))| \\
&= \left| -\sup_{u}\left(-G^x(u)\right) + \sup_{u}\left(-G^x_n(u)\right) \right| + |G^x_n(\mu_n(x)) - G^x(\mu_n(x))| \\
&\le \sup_{u} |G^x(u) - G^x_n(u)| + |G^x_n(\mu_n(x)) - G^x(\mu_n(x))| \\
&\le 2 \sup_{u} |G^x_n(u) - G^x(u)|. \qquad (29)
\end{align*}

Moreover, since for any fixed $x \in \mathcal{F}$ the function $G^x(\cdot)$ is uniformly continuous and because $\mu(x)$
is the unique minimizer of the function $G^x(\cdot)$, we have, for any $\epsilon > 0$,

\[ \inf_{u : \|\mu(x) - u\| \ge \epsilon} G^x(u) > G^x(\mu(x)), \qquad (30) \]

which means that there exists, for every $\epsilon > 0$, a number $\eta(\epsilon) > 0$ such that $G^x(u) \ge G^x(\mu(x)) + \eta(\epsilon)$
for every $u$ such that $\|\mu(x) - u\| \ge \epsilon$. This implies that the event $\{\|\mu(x) - \mu_n(x)\| \ge \epsilon\}$ is included
in the event $\{G^x(\mu_n(x)) \ge G^x(\mu(x)) + \eta(\epsilon)\}$.
Using inequality (29) we get

\begin{align*}
\sum_{n \ge 1} P\left(\|\mu_n(x) - \mu(x)\| \ge \epsilon\right)
&\le \sum_{n \ge 1} P\left(G^x(\mu_n(x)) \ge G^x(\mu(x)) + \eta(\epsilon)\right) \\
&\le \sum_{n \ge 1} P\left(\sup_{u} |G^x_n(u) - G^x(u)| \ge \eta(\epsilon)/2\right) < \infty,
\end{align*}

similarly to the proof of Proposition 3.1. The statement (12) then follows from an application
of the Borel-Cantelli lemma. □

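Proof of Proposition 3.2

To prove Proposition 3.2, it suffices to see that

\[ \|H^x_n(\xi_n(j)) - H^x(\mu)\| \le \|H^x_n(\xi_n(j)) - H^x_n(\mu)\| + \|H^x_n(\mu) - H^x(\mu)\|. \qquad (31) \]

Concerning the first term, observe that

\[ H^x_n(\xi_n(j)) - H^x_n(\mu) = A_n + B_n, \qquad (32) \]

where

\[ A_n = I_d\, \frac{1}{nE(\Delta_1(x))} \sum_{i=1}^{n} \frac{\|Y_i - \mu\| - \|Y_i - \xi_n(j)\|}{\|Y_i - \mu\|\, \|Y_i - \xi_n(j)\|}\, \Delta_i(x) \]

and

\[ B_n = \frac{1}{nE(\Delta_1(x))} \sum_{i=1}^{n} \frac{\Delta_i(x)}{\|Y_i - \mu\|\, \|Y_i - \xi_n(j)\|} \times \Big[ \|Y_i - \xi_n(j)\|\, U(Y_i - \mu)\, U^T(Y_i - \mu) - \|Y_i - \mu\|\, U(Y_i - \xi_n(j))\, U^T(Y_i - \xi_n(j)) \Big]. \]

Using Theorem 3.2 and the triangle inequality, we can easily see that
$A_n = o_{a.s.}(1) \times \frac{1}{nE(\Delta_1(x))} \sum_{i=1}^{n} \frac{\Delta_i(x)}{\|Y_i - \mu\|^2}$.
Combining the Markov and Cauchy-Schwarz inequalities and making use of assumption (H3)(iii),
we can easily prove that $\frac{1}{nE(\Delta_1(x))} \sum_{i=1}^{n} \frac{\Delta_i(x)}{\|Y_i - \mu\|^2} = O_P(1)$. We then conclude that $A_n = o_P(1)$.

For the second term $B_n$ of (32), we have, by the triangle inequality and the fact
that $\|U(Y_i - \cdot)\| = 1$, that

\begin{align*}
&\Big\| \|Y_i - \xi_n(j)\|\, U(Y_i - \mu)\, U^T(Y_i - \mu) - \|Y_i - \mu\|\, U(Y_i - \xi_n(j))\, U^T(Y_i - \xi_n(j)) \Big\| \\
&\quad \le \Big| \|Y_i - \xi_n(j)\| - \|Y_i - \mu\| \Big| + \|Y_i - \mu\|\, \Big\| U(Y_i - \mu)\, U^T(Y_i - \mu) - U(Y_i - \xi_n(j))\, U^T(Y_i - \xi_n(j)) \Big\| \\
&\quad \le \|\mu - \xi_n(j)\| + \|Y_i - \mu\| \times \Big\| U(Y_i - \mu)\, U^T(Y_i - \mu) - U(Y_i - \xi_n(j))\, U^T(Y_i - \xi_n(j)) \Big\|.
\end{align*}

Since

\begin{align*}
U(Y_i - \mu)\, U^T(Y_i - \mu) - U(Y_i - \xi_n(j))\, U^T(Y_i - \xi_n(j))
&= \left[U(Y_i - \mu) - U(Y_i - \xi_n(j))\right] U^T(Y_i - \mu) \\
&\quad + U(Y_i - \xi_n(j)) \left[U^T(Y_i - \mu) - U^T(Y_i - \xi_n(j))\right],
\end{align*}

and $\|U(Y_i - \mu) - U(Y_i - \xi_n(j))\| \le 2\, \frac{\|\xi_n(j) - \mu\|}{\|Y_i - \mu\|}$, we can conclude, by using Theorem 3.2, that

\[ B_n = o_{a.s.}(1) \times \frac{1}{nE(\Delta_1(x))} \sum_{i=1}^{n} \frac{\Delta_i(x)}{\|Y_i - \mu\|^2}. \]

Finally, using the same arguments as above (concerning the term $A_n$), we get $B_n = o_P(1)$,
which allows us to conclude that $\|H^x_n(\xi_n(j)) - H^x_n(\mu)\| = o_P(1)$. We now turn to the
second term on the right-hand side of (31). Write

\[ H^x_n(\mu) - H^x(\mu) = \underbrace{H^x_n(\mu) - E\left[H^x_n(\mu)\right]}_{K_{n,1}} + \underbrace{E\left[H^x_n(\mu)\right] - H^x(\mu)}_{K_{n,2}}. \]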
We have to show that each term $K_{n,i}$ ($i = 1, 2$) is asymptotically negligible. We have

\[ \|K_{n,1}\|^2 = \mathrm{tr}\left(K^T_{n,1} K_{n,1}\right) = \sum_{k=1}^{d} \sum_{j=1}^{d} Z^2_{k,j}, \]

where $(Z_{k,j})_{1 \le k,j \le d}$ is the general term of the matrix $K_{n,1}$, which can be written as

\[ Z_{k,j} = \frac{1}{nE(\Delta_1(x))} \sum_{i=1}^{n} \left[M_{k,j}(Y_i, \mu)\, \Delta_i(x) - E\left(M_{k,j}(Y_i, \mu)\, \Delta_i(x)\right)\right]. \]

Using assumption (H3)(iv), Lemma 5.1 and Corollary A.8 of Ferraty and Vieu (2006), we can
easily prove that for all $1 \le k, j \le d$, $Z_{k,j} = o_P(1)$.
To handle $K_{n,2}$, observe that

\begin{align*}
\|K_{n,2}\| &= \left\| E\left[H^x_n(\mu)\right] - H^x(\mu) \right\| \\
&\le \frac{1}{E(\Delta_1(x))}\, E\left(\|H^{X_1}(\mu) - H^x(\mu)\|\, \Delta_1(x)\right) \\
&\le \sup_{x' \in B(x,h)} \|H^{x'}(\mu) - H^x(\mu)\| = o_{a.s.}(1)
\end{align*}

in view of condition (H3)(ii). □



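Lemma 5.6 Under hypotheses (H1)-(H2) and (H4)(ii), and if for any $\delta > 0$, $(n\phi(h))^{-\delta/2} \to 0$, we
have

\[ \sqrt{n\phi(h)} \left(\nabla_u G^x_n(\mu) - E\left[\nabla_u G^x_n(\mu)\right]\right) \xrightarrow{\ D\ } \mathcal{N}_d\left(0, \Sigma^x(\mu)\right), \]

where $\Sigma^x(\mu)$ is the limiting covariance matrix of $\nabla_u G^x_n(\mu) - E\left[\nabla_u G^x_n(\mu)\right]$.

Proof of Lemma 5.6. Let us denote by

\[ \mathcal{A}_i = \frac{\sqrt{\phi(h)}}{E(\Delta_1(x))} \times \left[U(Y_i - \mu)\, \Delta_i(x) - E\left(U(Y_i - \mu)\, \Delta_i(x)\right)\right]. \]

Then

\[ \sqrt{n\phi(h)} \left(\nabla_u G^x_n(\mu) - E\left[\nabla_u G^x_n(\mu)\right]\right) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \mathcal{A}_i. \]

By the Cramér-Wold device, Lemma 5.6 can be proved by finding the limit distribution of the
sequence of real variables $\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \ell^T \mathcal{A}_i$, for all $\ell \in \mathbb{R}^d$ satisfying $\|\ell\| \ne 0$.

The random variables $(\ell^T \mathcal{A}_1, \dots, \ell^T \mathcal{A}_n)$ are i.i.d. with zero mean and asymptotic variance

\[ \sigma^2(x) = \lim_{n \to \infty} \mathrm{Var}\left(\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \ell^T \mathcal{A}_i\right), \]

so the result may be obtained by applying the Lyapunov central limit theorem. For this purpose,
we have to prove the following Lindeberg-type condition:

\[ \forall \delta > 0, \qquad \left(n\, \ell^T \Sigma^x(\mu)\, \ell\right)^{-(2+\delta)/2} \sum_{i=1}^{n} E\left|\ell^T \mathcal{A}_i\right|^{2+\delta} \to 0 \quad \text{as } n \to \infty. \]

It is easy to see that

\[ \left(n\, \ell^T \Sigma^x(\mu)\, \ell\right)^{-(2+\delta)/2} \sum_{i=1}^{n} E\left|\ell^T \mathcal{A}_i\right|^{2+\delta} = n^{-\delta/2} \left(\ell^T \Sigma^x(\mu)\, \ell\right)^{-(2+\delta)/2} E\left|\ell^T \mathcal{A}_1\right|^{2+\delta}. \]

Moreover, using the $C_r$ and Jensen inequalities, we obtain

\begin{align*}
E\left|\ell^T \mathcal{A}_1\right|^{2+\delta}
&\le c\, \frac{(\phi(h))^{(2+\delta)/2}}{(E\Delta_1(x))^{2+\delta}}\, E\left[\left|\ell^T U(Y_1 - \mu)\right|^{2+\delta} \Delta_1^{2+\delta}(x)\right] \\
&= c\, \frac{(\phi(h))^{(2+\delta)/2}}{(E\Delta_1(x))^{2+\delta}}\, E\left[\Delta_1^{2+\delta}(x)\, E\left(\left|\ell^T U(Y_1 - \mu)\right|^{2+\delta} \mid X_1\right)\right] \\
&\le c\, \frac{(\phi(h))^{(2+\delta)/2}}{(E\Delta_1(x))^{2+\delta}} \left[E\left(\Delta_1^{2+\delta}(x)\right) \sup_{x' \in B(x,h)} \left|W^{x'}_{2+\delta}(\mu) - W^x_{2+\delta}(\mu)\right| + W^x_{2+\delta}(\mu)\, E\left(\Delta_1^{2+\delta}(x)\right)\right].
\end{align*}

It then follows, by hypothesis (H4)(ii) and Lemma 5.1, that

\[ E\left|\ell^T \mathcal{A}_1\right|^{2+\delta} \le c'\, \frac{(\phi(h))^{(2+\delta)/2}}{(\phi(h))^{2+\delta}} \left[\phi(h)\left(M_{2+\delta}\, g(x) + o(1)\right)\right]. \]

Finally, since $\left(\ell^T \Sigma^x(\mu)\, \ell\right)^{-(2+\delta)/2}$ is finite, it follows that

\[ \left(n\, \ell^T \Sigma^x(\mu)\, \ell\right)^{-(2+\delta)/2} \sum_{i=1}^{n} E\left|\ell^T \mathcal{A}_i\right|^{2+\delta} = O\left((n\phi(h))^{-\delta/2}\right) = o(1), \]

because $n\phi(h) \to \infty$ as $n \to \infty$. This implies the Lindeberg condition, which completes the
proof of the lemma. □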

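The following lemma gives the analytic expression of the matrix $\Sigma^x(\mu)$.

Lemma 5.7 Under conditions (H1)-(H2) and (H4)(ii), we have

\[ \sigma^2(x) = \lim_{n \to \infty} \mathrm{Var}\left(\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \ell^T \mathcal{A}_i\right) = \frac{M_2}{M_1^2\, g(x)}\, \ell^T \Sigma^x(\mu)\, \ell. \]

Proof of Lemma 5.7. Since the random variables $(\ell^T \mathcal{A}_i)_{i=1,\dots,n}$ are i.i.d. with mean zero, it follows
that

\[ \sigma^2(x) = \lim_{n \to \infty} \mathrm{Var}\left(\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \ell^T \mathcal{A}_i\right) = \lim_{n \to \infty} \mathrm{Var}\left(\ell^T \mathcal{A}_1\right) = \lim_{n \to \infty} E\left(\left(\ell^T \mathcal{A}_1\right)^2\right). \]

On the other hand, making use of the properties of conditional expectation, one may write

\[ E\left[\left(\ell^T \mathcal{A}_1\right)^2\right] = \frac{\phi(h)\, E\left[\Delta_1^2 \left(\ell^T U(Y_1 - \mu)\right)^2\right]}{(E\Delta_1)^2} = \frac{\phi(h)}{(E\Delta_1)^2}\, E\left[\Delta_1^2\, W_2^{X_1}(\mu)\right]. \]

Making use of condition (H4)(ii) and the fact that the function $W_2^x(\cdot)$ is bounded, we obtain

\[ E\left[\Delta_1^2\, W_2^{X_1}(\mu)\right] \le E\left(\Delta_1^2\right) \left[W_2^x(\mu) + \sup_{x' \in B(x,h)} \left|W_2^{x'}(\mu) - W_2^x(\mu)\right|\right] = W_2^x(\mu)\, E\left(\Delta_1^2\right) + o\left(E\left(\Delta_1^2\right)\right). \]

Using Lemma 5.1, one may see that

\[ \frac{\phi(h)}{(E\Delta_1)^2}\, E\left(\Delta_1^2\right) = \frac{M_2}{M_1^2\, g(x)} + o(1). \]

Therefore,

\[ \sigma^2(x) = \frac{M_2}{M_1^2\, g(x)}\, W_2^x(\mu) + o(1). \]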



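Proof of Proposition 3.3. For each $x \in \mathcal{F}$, since $(X_i, Y_i)_{i=1,\dots,n}$ are i.i.d., we have

\[ B_n(x) = E\left[\nabla_u G^x_n(\mu)\right] = \frac{E\left[U(Y_1 - \mu)\, \Delta_1(x)\right]}{E(\Delta_1(x))}. \]

By conditioning with respect to the real variable $d(x, X_1)$ and using condition (H5), we have

\[ B_n(x) = \frac{E\left[K\left(\frac{d(x, X_1)}{h}\right) \psi(d(x, X_1))\right]}{E\left[K\left(\frac{d(x, X_1)}{h}\right)\right]}. \]

Integration with respect to the distribution of the real variable $d(x, X_1)$ shows that

\[ A_1 := E\left[K\left(\frac{d(x, X_1)}{h}\right) \psi(d(x, X_1))\right] = \int_0^1 K(t)\, \psi(th)\, dF(th), \]

where $F$ is the cumulative distribution function of the real random variable $d(x, X)$. On the other
hand, a Taylor series expansion of the function $\psi$ up to order one in the neighborhood of $t = 0$
gives $\psi(th) = th\, \nabla\psi(0) + o_d(h)$. Here $o_d(1)$ (resp. $O_d(1)$) denotes a d-dimensional vector where
each component equals $o(1)$ (resp. $O(1)$).
Therefore, we have

\begin{align*}
A_1 &= h\, \nabla\psi(0) \int_0^1 t K(t)\, dF(th) + o_d(h) \int_0^1 K(t)\, dF(th) \\
&= h\, \nabla\psi(0) \left[K(1) F(h) - \int_0^1 (sK(s))'\, F(sh)\, ds\right] + o_d(h) \left[K(1) F(h) - \int_0^1 K'(s)\, F(sh)\, ds\right].
\end{align*}

Using hypothesis (H2)(i)-(ii) we get

\begin{align*}
A_1 &= h\, \nabla\psi(0)\, K(1) \left(\phi(h) g(x) + o(\phi(h))\right) - h\, \nabla\psi(0) \int_0^1 (sK(s))' \left(\phi(sh) g(x) + o(\phi(hs))\right) ds \\
&\quad + o_d(h)\, K(1) \left(\phi(h) g(x) + o(\phi(h))\right) - o_d(h) \int_0^1 K'(s) \left(\phi(sh) g(x) + o(\phi(sh))\right) ds \\
&= h \phi(h)\, g(x)\, \nabla\psi(0) \left[K(1) - \int_0^1 (sK(s))' \left(\tau_0(s) + o(1)\right) ds\right] + h \phi(h)\, K(1)\, o_d(1) \\
&\quad - o_d(h \phi(h)) \int_0^1 K'(s) \left(\tau_0(s) g(x) + o(1)\right) ds \\
&= h \phi(h)\, g(x)\, \nabla\psi(0) \left[K(1) - \int_0^1 (sK(s))'\, \tau_0(s)\, ds\right] + o_d(h \phi(h)).
\end{align*}

Thus, making use of Lemma 5.1, we obtain

\[ B_n(x) = \frac{h\, \nabla\psi(0)}{M_1} \left[K(1) - \int_0^1 (sK(s))'\, \tau_0(s)\, ds\right] + o_{a.s.}(h). \]

□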



Proof of Theorem 3.3

Part (i) follows from Proposition 3.2, decomposition (17), Lemma 5.6 and Lemma 5.7.
Part (ii) follows from Proposition 3.3 combined with the condition $\sqrt{n\phi(h)}\, h \to 0$ as $n \to \infty$. □

Proof of Corollary 3.5. Let us denote by

\[ \Gamma^x(\mu) = \left[\Sigma^x(\mu)\right]^{-1/2} H^x(\mu), \qquad \Gamma^x_n(\mu_n) = \left[\Sigma^x_n(\mu_n)\right]^{-1/2} H^x_n(\mu_n) \]

and

\[ V_n(\mu_n) = \frac{M_{1,n}}{\sqrt{M_{2,n}}} \sqrt{n F_{x,n}(h)}\; \Gamma^x_n(\mu_n)\, (\mu_n - \mu). \]

Write

\[ V_n(\mu_n) := V_{n,1} \times V_{n,2}. \qquad (33) \]

Making use of Theorem 3.3 part (ii), the term $V_{n,2}$ converges in distribution to $\mathcal{N}(0, I_d)$.

Now, to get the result of the corollary, it suffices to show that the first term $V_{n,1}$ converges to 1 in
probability. Following the same arguments as in Laib and Louani (2010) combined with (H1), (H2),
one gets

\[ \sqrt{n F_{x,n}(h)}\, \left(n \phi(h) g(x)\right)^{-1/2} \xrightarrow{\ P\ } 1, \qquad M_{1,n} \xrightarrow{\ P\ } M_1 \quad \text{and} \quad M_{2,n} \xrightarrow{\ P\ } M_2, \quad \text{as } n \to \infty. \]

Now, we have to establish the consistency of $\Gamma^x_n(\mu_n)$. To do that, we study separately the
consistency of each term of $\Gamma^x_n(\mu_n)$. Let us start with $H^x_n(\mu_n)$. For this, write



According to Theorem 3.2, Proposition 3.2, Lemma 5.2 and the fact that the matrix $H^x(\mu)$ is
bounded, we can conclude that $H^x_n(\mu_n)$ converges, in probability, to $H^x(\mu)$.

The second term, $\Sigma^x_n(\mu_n)$, can be treated similarly. Finally, this leads to the convergence in
probability of $\Gamma^x_n(\mu_n)$ to $\Gamma^x(\mu)$. □